From nobody Sun May 24 20:33:06 2026
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny
Subject: [PATCH v10 01/31] cxl/mbox: Flag support for Dynamic Capacity Devices (DCD)
Date: Sat, 23 May 2026 02:42:55 -0700
Message-ID: <4700826deb086665c9e1c643156864eaecfe1fef.1779528761.git.anisa.su@samsung.com>
X-Mailer: git-send-email 2.43.0

From: Ira Weiny

Per the CXL 3.1 specification software must check the Command Effects
Log (CEL) for dynamic capacity command support.

Detect support for the DCD commands while reading the CEL, including:

	Get DC Config
	Get DC Extent List
	Add DC Response
	Release DC

Based on an original patch by Navneet Singh.

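The detection rule described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: the OP_* macros reuse the opcode values this patch adds to enum cxl_opcode, while the enum dcd_bit names, the helper names and the plain uint32_t mask are hypothetical stand-ins for the kernel bitmap used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values taken from the enum cxl_opcode additions in this patch. */
#define OP_GET_DC_CONFIG      0x4800
#define OP_GET_DC_EXTENT_LIST 0x4801
#define OP_ADD_DC_RESPONSE    0x4802
#define OP_RELEASE_DC         0x4803

/* Hypothetical userspace stand-in for enum dcd_cmd_enabled_bits. */
enum dcd_bit {
	DCD_GET_CONFIG,
	DCD_GET_EXTENT_LIST,
	DCD_ADD_RESPONSE,
	DCD_RELEASE,
	DCD_MAX
};

/* All DCD opcodes share the command-set byte 0x48. */
static bool is_dcd_opcode(uint16_t opcode)
{
	return (opcode >> 8) == 0x48;
}

/* Walk a list of CEL opcodes and record which DCD commands were seen. */
static uint32_t dcd_mask_from_cel(const uint16_t *opcodes, size_t n)
{
	uint32_t mask = 0;	/* starts empty so absent commands stay clear */

	assert(opcodes != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!is_dcd_opcode(opcodes[i]))
			continue;
		switch (opcodes[i]) {
		case OP_GET_DC_CONFIG:
			mask |= 1u << DCD_GET_CONFIG;
			break;
		case OP_GET_DC_EXTENT_LIST:
			mask |= 1u << DCD_GET_EXTENT_LIST;
			break;
		case OP_ADD_DC_RESPONSE:
			mask |= 1u << DCD_ADD_RESPONSE;
			break;
		case OP_RELEASE_DC:
			mask |= 1u << DCD_RELEASE;
			break;
		default:
			break;	/* other 0x48xx opcodes are ignored */
		}
	}
	return mask;
}

/* DCD is usable only when every required command is present. */
static bool dcd_all_supported(uint32_t mask)
{
	return mask == (1u << DCD_MAX) - 1u;
}
```

A CEL that lists only three of the four opcodes yields a mask with a hole in it, so dcd_all_supported() returns false and DCD stays disabled.
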
Signed-off-by: Ira Weiny
---
Changes:
[anisa: rebase]
---
 drivers/cxl/core/mbox.c | 43 +++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h    | 15 ++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index aaa5c6277ebf..7ef5708bf210 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -165,6 +165,42 @@ static void cxl_set_security_cmd_enabled(struct cxl_security_state *security,
 	}
 }
 
+static bool cxl_is_dcd_command(u16 opcode)
+{
+#define CXL_MBOX_OP_DCD_CMDS 0x48
+
+	return (opcode >> 8) == CXL_MBOX_OP_DCD_CMDS;
+}
+
+static void cxl_set_dcd_cmd_enabled(struct cxl_memdev_state *mds, u16 opcode,
+				    unsigned long *cmd_mask)
+{
+	switch (opcode) {
+	case CXL_MBOX_OP_GET_DC_CONFIG:
+		set_bit(CXL_DCD_ENABLED_GET_CONFIG, cmd_mask);
+		break;
+	case CXL_MBOX_OP_GET_DC_EXTENT_LIST:
+		set_bit(CXL_DCD_ENABLED_GET_EXTENT_LIST, cmd_mask);
+		break;
+	case CXL_MBOX_OP_ADD_DC_RESPONSE:
+		set_bit(CXL_DCD_ENABLED_ADD_RESPONSE, cmd_mask);
+		break;
+	case CXL_MBOX_OP_RELEASE_DC:
+		set_bit(CXL_DCD_ENABLED_RELEASE, cmd_mask);
+		break;
+	default:
+		break;
+	}
+}
+
+static bool cxl_verify_dcd_cmds(struct cxl_memdev_state *mds, unsigned long *cmds_seen)
+{
+	DECLARE_BITMAP(all_cmds, CXL_DCD_ENABLED_MAX);
+
+	bitmap_fill(all_cmds, CXL_DCD_ENABLED_MAX);
+	return bitmap_equal(cmds_seen, all_cmds, CXL_DCD_ENABLED_MAX);
+}
+
 static bool cxl_is_poison_command(u16 opcode)
 {
 #define CXL_MBOX_OP_POISON_CMDS 0x43
@@ -757,6 +793,7 @@ static void cxl_walk_cel(struct cxl_memdev_state *mds, size_t size, u8 *cel)
 	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
 	struct cxl_cel_entry *cel_entry;
 	const int cel_entries = size / sizeof(*cel_entry);
+	DECLARE_BITMAP(dcd_cmds, CXL_DCD_ENABLED_MAX) = { 0 };
 	struct device *dev = mds->cxlds.dev;
 	int i, ro_cmds = 0, wr_cmds = 0;
 
@@ -785,11 +822,17 @@ static void cxl_walk_cel(struct cxl_memdev_state *mds, size_t size, u8 *cel)
 			enabled++;
 		}
 
+		if (cxl_is_dcd_command(opcode)) {
+			cxl_set_dcd_cmd_enabled(mds, opcode, dcd_cmds);
+			enabled++;
+		}
+
 		dev_dbg(dev, "Opcode 0x%04x %s\n", opcode,
 			enabled ? "enabled" : "unsupported by driver");
 	}
 
 	set_features_cap(cxl_mbox, ro_cmds, wr_cmds);
+	mds->dcd_supported = cxl_verify_dcd_cmds(mds, dcd_cmds);
 }
 
 static struct cxl_mbox_get_supported_logs *cxl_get_gsl(struct cxl_memdev_state *mds)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 776c50d1db51..53444af448d7 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -230,6 +230,15 @@ struct cxl_event_state {
 	struct mutex log_lock;
 };
 
+/* Device enabled DCD commands */
+enum dcd_cmd_enabled_bits {
+	CXL_DCD_ENABLED_GET_CONFIG,
+	CXL_DCD_ENABLED_GET_EXTENT_LIST,
+	CXL_DCD_ENABLED_ADD_RESPONSE,
+	CXL_DCD_ENABLED_RELEASE,
+	CXL_DCD_ENABLED_MAX
+};
+
 /* Device enabled poison commands */
 enum poison_cmd_enabled_bits {
 	CXL_POISON_ENABLED_LIST,
@@ -405,6 +414,7 @@ static inline struct cxl_dev_state *mbox_to_cxlds(struct cxl_mailbox *cxl_mbox)
  * @partition_align_bytes: alignment size for partition-able capacity
  * @active_volatile_bytes: sum of hard + soft volatile
  * @active_persistent_bytes: sum of hard + soft persistent
+ * @dcd_supported: all DCD commands are supported
  * @event: event log driver state
  * @poison: poison driver state info
  * @security: security driver state info
@@ -424,6 +434,7 @@ struct cxl_memdev_state {
 	u64 partition_align_bytes;
 	u64 active_volatile_bytes;
 	u64 active_persistent_bytes;
+	bool dcd_supported;
 
 	struct cxl_event_state event;
 	struct cxl_poison_state poison;
@@ -485,6 +496,10 @@ enum cxl_opcode {
 	CXL_MBOX_OP_UNLOCK		= 0x4503,
 	CXL_MBOX_OP_FREEZE_SECURITY	= 0x4504,
 	CXL_MBOX_OP_PASSPHRASE_SECURE_ERASE	= 0x4505,
+	CXL_MBOX_OP_GET_DC_CONFIG	= 0x4800,
+	CXL_MBOX_OP_GET_DC_EXTENT_LIST	= 0x4801,
+	CXL_MBOX_OP_ADD_DC_RESPONSE	= 0x4802,
+	CXL_MBOX_OP_RELEASE_DC		= 0x4803,
 	CXL_MBOX_OP_MAX			= 0x10000
 };
 
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny
Subject: [PATCH v10 02/31] cxl/mem: Read dynamic capacity configuration from the device
Date: Sat, 23 May 2026 02:42:56 -0700
Message-ID: <692890d6934d844cbbe90596499b28833e45f4f5.1779528761.git.anisa.su@samsung.com>
X-Mailer: git-send-email 2.43.0

From: Ira Weiny

Devices which optionally support Dynamic Capacity (DC) are configured
via mailbox commands. CXL 3.2 section 9.13.3 requires the host to issue
the Get DC Configuration command in order to properly configure DCDs.
Without the Get DC Configuration command DCD can't be supported.

Implement the DC mailbox commands as specified in CXL 3.2 section
8.2.10.9.9 (opcodes 48XXh) to read and store the DCD configuration
information. Disable DCD if an invalid configuration is found.

Linux has no support for more than one dynamic capacity partition. Read
and validate all the partitions but configure only the first partition
as 'dynamic ram A'. Additional partitions can be added in the future if
such a device ever materializes.

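The "validate every partition, configure only the first" policy above can be sketched in userspace as follows. This is a hedged illustration, not the driver code: struct dc_part, dc_part_valid() and dc_pick_first() are hypothetical names, sizes are assumed to be already converted to bytes, and the block size is checked first here (an ordering chosen for this sketch) so that the later modulo operations never divide by zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_256M         0x10000000ULL
#define BLOCK_LINE_SIZE 0x40ULL	/* mirrors CXL_DCD_BLOCK_LINE_SIZE */

/* Hypothetical host-endian view of one DC partition, all values in bytes. */
struct dc_part {
	uint64_t start;		/* base DPA */
	uint64_t size;		/* decode length */
	uint64_t len;		/* usable length */
	uint64_t blk_size;	/* block size */
};

static bool is_pow2(uint64_t v)
{
	return v && !(v & (v - 1));
}

/* Check one partition against its predecessor (NULL for the first one). */
static bool dc_part_valid(const struct dc_part *prev, const struct dc_part *p)
{
	assert(p != NULL);

	/* Block size: non-zero power of two, multiple of the line size. */
	if (!is_pow2(p->blk_size) || p->blk_size % BLOCK_LINE_SIZE)
		return false;
	/* Partitions must appear in increasing DPA order without overlap. */
	if (prev && prev->start + prev->size > p->start)
		return false;
	/* Start must be aligned to 256MB and to the block size. */
	if (p->start % SZ_256M || p->start % p->blk_size)
		return false;
	/* Lengths must be non-zero, fit the decode size, and be block aligned. */
	if (p->size == 0 || p->len == 0 || p->size < p->len ||
	    p->len % p->blk_size)
		return false;
	return true;
}

/* Validate all partitions; on success return the only one that gets used. */
static const struct dc_part *dc_pick_first(const struct dc_part *parts, size_t n)
{
	if (n == 0)
		return NULL;
	for (size_t i = 0; i < n; i++)
		if (!dc_part_valid(i ? &parts[i - 1] : NULL, &parts[i]))
			return NULL;
	return &parts[0];
}
```

A single bad partition anywhere in the list makes dc_pick_first() return NULL, which corresponds to disabling DCD for the whole device rather than using a partially valid configuration.
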
Additionally, it is anticipated that no skips will be present from the
end of the pmem partition. Check for and disallow this configuration as
well.

Linux has no use for the trailing fields of the Get Dynamic Capacity
Configuration Output Payload (Total number of supported extents, number
of available extents, total number of supported tags, and number of
available tags). Leave those fields undefined so that the partition
list can be declared as a C flexible array member, which is more useful.

Based on an original patch by Navneet Singh.

Signed-off-by: Ira Weiny
---
Changes:
[anisa: rebase]
[jonathan: mbox.c: use max possible size for get_dc_config command to
 avoid vmalloc]
[jonathan & fan: cxlmem.h: remove unused struct cxl_mem_dev_info]
---
 drivers/cxl/core/hdm.c  |   2 +
 drivers/cxl/core/mbox.c | 182 ++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h    |  47 +++++++++++
 drivers/cxl/pci.c       |   3 +
 include/cxl/cxl.h       |   3 +-
 5 files changed, 236 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index 3930e130d6b6..28974adaab75 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -453,6 +453,8 @@ static const char *cxl_mode_name(enum cxl_partition_mode mode)
 		return "ram";
 	case CXL_PARTMODE_PMEM:
 		return "pmem";
+	case CXL_PARTMODE_DYNAMIC_RAM_A:
+		return "dynamic_ram_a";
 	default:
 		return "";
 	};
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 7ef5708bf210..71b29cd6abfe 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1351,6 +1351,156 @@ int cxl_mem_sanitize(struct cxl_memdev *cxlmd, u16 cmd)
 	return -EBUSY;
 }
 
+static int cxl_dc_check(struct device *dev, struct cxl_dc_partition_info *part_array,
+			u8 index, struct cxl_dc_partition *dev_part)
+{
+	size_t blk_size = le64_to_cpu(dev_part->block_size);
+	size_t len = le64_to_cpu(dev_part->length);
+
+	part_array[index].start = le64_to_cpu(dev_part->base);
+	part_array[index].size = le64_to_cpu(dev_part->decode_length);
+	part_array[index].size *= CXL_CAPACITY_MULTIPLIER;
+
+	/* Check partitions are in increasing DPA order */
+	if (index > 0) {
+		struct cxl_dc_partition_info *prev_part = &part_array[index - 1];
+
+		if ((prev_part->start + prev_part->size) >
+		    part_array[index].start) {
+			dev_err(dev,
+				"DPA ordering violation for DC partition %d and %d\n",
+				index - 1, index);
+			return -EINVAL;
+		}
+	}
+
+	if (!IS_ALIGNED(part_array[index].start, SZ_256M) ||
+	    !IS_ALIGNED(part_array[index].start, blk_size)) {
+		dev_err(dev, "DC partition %d invalid start %zu blk size %zu\n",
+			index, part_array[index].start, blk_size);
+		return -EINVAL;
+	}
+
+	if (part_array[index].size == 0 || len == 0 ||
+	    part_array[index].size < len || !IS_ALIGNED(len, blk_size)) {
+		dev_err(dev, "DC partition %d invalid length; size %zu len %zu blk size %zu\n",
+			index, part_array[index].size, len, blk_size);
+		return -EINVAL;
+	}
+
+	if (blk_size == 0 || blk_size % CXL_DCD_BLOCK_LINE_SIZE ||
+	    !is_power_of_2(blk_size)) {
+		dev_err(dev, "DC partition %d invalid block size; %zu\n",
+			index, blk_size);
+		return -EINVAL;
+	}
+
+	dev_dbg(dev, "DC partition %d start %zu size %zu blk size %zu\n",
+		index, part_array[index].start, part_array[index].size,
+		blk_size);
+
+	return 0;
+}
+
+/* Returns the number of partitions in dc_resp or -ERRNO */
+static int cxl_get_dc_config(struct cxl_mailbox *mbox, u8 start_partition,
+			     struct cxl_mbox_get_dc_config_out *dc_resp,
+			     size_t dc_resp_size)
+{
+	struct cxl_mbox_get_dc_config_in get_dc = (struct cxl_mbox_get_dc_config_in) {
+		.partition_count = CXL_MAX_DC_PARTITIONS,
+		.start_partition_index = start_partition,
+	};
+	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_DC_CONFIG,
+		.payload_in = &get_dc,
+		.size_in = sizeof(get_dc),
+		.size_out = dc_resp_size,
+		.payload_out = dc_resp,
+		.min_out = 8,
+	};
+	int rc;
+
+	rc = cxl_internal_send_cmd(mbox, &mbox_cmd);
+	if (rc < 0)
+		return rc;
+
+	dev_dbg(mbox->host, "Read %d/%d DC partitions\n",
+		dc_resp->partitions_returned, dc_resp->avail_partition_count);
+	return dc_resp->partitions_returned;
+}
+
+/**
+ * cxl_dev_dc_identify() - Reads the dynamic capacity information from the
+ *                         device.
+ * @mbox: Mailbox to query
+ * @dc_info: The dynamic partition information to return
+ *
+ * Read Dynamic Capacity information from the device and return the partition
+ * information.
+ *
+ * Return: 0 if identify was executed successfully, -ERRNO on error.
+ *         On error, @dc_info is left unchanged.
+ */
+int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
+			struct cxl_dc_partition_info *dc_info)
+{
+	struct cxl_dc_partition_info partitions[CXL_MAX_DC_PARTITIONS];
+	struct device *dev = mbox->host;
+	size_t dc_resp_size = sizeof(struct cxl_mbox_get_dc_config_out) +
+		CXL_MAX_DC_PARTITIONS * sizeof(struct cxl_dc_partition);
+	u8 start_partition;
+	u8 num_partitions;
+
+	struct cxl_mbox_get_dc_config_out *dc_resp __free(kfree) =
+		kmalloc(dc_resp_size, GFP_KERNEL);
+	if (!dc_resp)
+		return -ENOMEM;
+
+	/*
+	 * Read and check all partition information for validity and potential
+	 * debugging; see debug output in cxl_dc_check()
+	 */
+	start_partition = 0;
+	num_partitions = 0;
+	do {
+		int rc, i, j;
+
+		rc = cxl_get_dc_config(mbox, start_partition, dc_resp, dc_resp_size);
+		if (rc < 0) {
+			dev_err(dev, "Failed to get DC config: %d\n", rc);
+			return rc;
+		}
+
+		num_partitions += rc;
+
+		if (num_partitions < 1 || num_partitions > CXL_MAX_DC_PARTITIONS) {
+			dev_err(dev, "Invalid num of dynamic capacity partitions %d\n",
+				num_partitions);
+			return -EINVAL;
+		}
+
+		for (i = start_partition, j = 0; i < num_partitions; i++, j++) {
+			rc = cxl_dc_check(dev, partitions, i,
+					  &dc_resp->partition[j]);
+			if (rc)
+				return rc;
+		}
+
+		start_partition = num_partitions;
+
+	} while (num_partitions < dc_resp->avail_partition_count);
+
+	/* Return 1st partition */
+	dc_info->start = partitions[0].start;
+	dc_info->size = partitions[0].size;
+	dev_dbg(dev, "Returning partition 0 %zu size %zu\n",
+		dc_info->start, dc_info->size);
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_dev_dc_identify, "CXL");
+
 static void add_part(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_partition_mode mode)
 {
 	int i = info->nr_partitions;
@@ -1421,6 +1571,38 @@ int cxl_get_dirty_count(struct cxl_memdev_state *mds, u32 *count)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_get_dirty_count, "CXL");
 
+void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info)
+{
+	struct cxl_dc_partition_info dc_info = { 0 };
+	struct device *dev = mds->cxlds.dev;
+	size_t skip;
+	int rc;
+
+	rc = cxl_dev_dc_identify(&mds->cxlds.cxl_mbox, &dc_info);
+	if (rc) {
+		dev_warn(dev,
+			 "Failed to read Dynamic Capacity config: %d\n", rc);
+		cxl_disable_dcd(mds);
+		return;
+	}
+
+	/* Skips between pmem and the dynamic partition are not supported */
+	skip = dc_info.start - info->size;
+	if (skip) {
+		dev_warn(dev,
+			 "Dynamic Capacity skip from pmem not supported: %zu\n",
+			 skip);
+		cxl_disable_dcd(mds);
+		return;
+	}
+
+	info->size += dc_info.size;
+	dev_dbg(dev, "Adding dynamic ram partition A; %zu size %zu\n",
+		dc_info.start, dc_info.size);
+	add_part(info, dc_info.start, dc_info.size, CXL_PARTMODE_DYNAMIC_RAM_A);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_configure_dcd, "CXL");
+
 int cxl_arm_dirty_shutdown(struct cxl_memdev_state *mds)
 {
 	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 53444af448d7..87386488ad10 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -380,6 +380,8 @@ struct cxl_security_state {
 	struct kernfs_node *sanitize_node;
 };
 
+#define CXL_MAX_DC_PARTITIONS 8
+
 static inline resource_size_t cxl_pmem_size(struct cxl_dev_state *cxlds)
 {
 	/*
@@ -664,6 +666,31 @@ struct cxl_mbox_set_shutdown_state_in {
 	u8 state;
 } __packed;
 
+/* See CXL 3.2 Table 8-178 get dynamic capacity config Input Payload */
+struct cxl_mbox_get_dc_config_in {
+	u8 partition_count;
+	u8 start_partition_index;
+} __packed;
+
+/* See CXL 3.2 Table 8-179 get dynamic capacity config Output Payload */
+struct cxl_mbox_get_dc_config_out {
+	u8 avail_partition_count;
+	u8 partitions_returned;
+	u8 rsvd[6];
+	/* See CXL 3.2 Table 8-180 */
+	struct cxl_dc_partition {
+		__le64 base;
+		__le64 decode_length;
+		__le64 length;
+		__le64 block_size;
+		__le32 dsmad_handle;
+		u8 flags;
+		u8 rsvd[3];
+	} __packed partition[] __counted_by(partitions_returned);
+	/* Trailing fields unused */
+} __packed;
+#define CXL_DCD_BLOCK_LINE_SIZE 0x40
+
 /* Set Timestamp CXL 3.0 Spec 8.2.9.4.2 */
 struct cxl_mbox_set_timestamp_in {
 	__le64 timestamp;
@@ -787,9 +814,18 @@ enum {
 int cxl_internal_send_cmd(struct cxl_mailbox *cxl_mbox,
 			  struct cxl_mbox_cmd *cmd);
 int cxl_dev_state_identify(struct cxl_memdev_state *mds);
+
+struct cxl_dc_partition_info {
+	size_t start;
+	size_t size;
+};
+
+int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
+			struct cxl_dc_partition_info *dc_info);
 int cxl_await_media_ready(struct cxl_dev_state *cxlds);
 int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
 int cxl_mem_dpa_fetch(struct cxl_memdev_state *mds, struct cxl_dpa_info *info);
+void cxl_configure_dcd(struct cxl_memdev_state *mds, struct cxl_dpa_info *info);
 struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev, u64 serial,
 						 u16 dvsec);
 void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
@@ -803,6 +839,17 @@ void cxl_event_trace_record(struct cxl_memdev *cxlmd,
 			    const uuid_t *uuid, union cxl_event *evt);
 int cxl_get_dirty_count(struct cxl_memdev_state *mds, u32 *count);
 int cxl_arm_dirty_shutdown(struct cxl_memdev_state *mds);
+
+static inline bool cxl_dcd_supported(struct cxl_memdev_state *mds)
+{
+	return mds->dcd_supported;
+}
+
+static inline void cxl_disable_dcd(struct cxl_memdev_state *mds)
+{
+	mds->dcd_supported = false;
+}
+
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index bace662dc988..60f9fa05d9ef 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -870,6 +870,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	if (cxl_dcd_supported(mds))
+		cxl_configure_dcd(mds, &range_info);
+
 	rc = cxl_dpa_setup(cxlds, &range_info);
 	if (rc)
 		return rc;
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index fa7269154620..bb1df0cef863 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -133,6 +133,7 @@ struct cxl_dpa_perf {
 enum cxl_partition_mode {
 	CXL_PARTMODE_RAM,
 	CXL_PARTMODE_PMEM,
+	CXL_PARTMODE_DYNAMIC_RAM_A,
 };
 
 /**
@@ -147,7 +148,7 @@ struct cxl_dpa_partition {
 	enum cxl_partition_mode mode;
 };
 
-#define CXL_NR_PARTITIONS_MAX 2
+#define CXL_NR_PARTITIONS_MAX 3
 
 /**
  * struct cxl_dev_state - The driver device state
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny
Subject: [PATCH v10 03/31] cxl/cdat: Gather DSMAS data for DCD partitions
Date: Sat, 23 May 2026 02:42:57 -0700
X-Mailer: git-send-email 2.43.0

From: Ira Weiny

Additional DCD partition (AKA region) information is contained in the
DSMAS CDAT tables, including performance, read only, and shareable
attributes.

Match DCD partitions with DSMAS tables and store the metadata.

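The matching step described above can be sketched in userspace as follows. This is an illustrative assumption rather than the driver code: struct range, struct dsmas, struct part and the helper names are simplified, hypothetical stand-ins, and the handle_mismatch flag stands in for the warning a driver might print when the handles disagree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive DPA range. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical, simplified view of one DSMAS entry. */
struct dsmas {
	uint8_t handle;
	bool shareable;
	struct range dpa;
};

/* Hypothetical, simplified view of one device partition. */
struct part {
	struct range dpa;
	uint8_t handle;		/* DSMAD handle reported by the device */
	bool is_dynamic;	/* true for a dynamic RAM partition */
	bool shareable;		/* metadata copied from the DSMAS entry */
	bool handle_mismatch;	/* stands in for a driver warning */
};

static bool range_contains(const struct range *outer, const struct range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

/* The handle field is 32 bits wide, but only the low 8 bits may be set. */
static bool dsmad_handle_ok(uint32_t raw, uint8_t *out)
{
	assert(out != NULL);
	if (raw & ~0xFFu)
		return false;
	*out = (uint8_t)raw;
	return true;
}

/*
 * Copy DSMAS metadata into the partition that contains its DPA range.
 * A handle mismatch on a dynamic partition is recorded, not rejected.
 */
static bool part_apply_dsmas(struct part *p, const struct dsmas *d)
{
	assert(p != NULL && d != NULL);
	if (!range_contains(&p->dpa, &d->dpa))
		return false;
	p->handle_mismatch = p->is_dynamic && p->handle != d->handle;
	p->shareable = d->shareable;
	return true;
}
```

Matching by DPA range first and treating the handle as a cross-check keeps the metadata usable even when a device reports inconsistent handles, at the cost of only flagging the inconsistency.
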
Signed-off-by: Ira Weiny
---
Changes:
[anisa: rebase]
[jonathan: core/mbox.c: error if there are non-zero reserved bits in
 DSMAD handle in cxl_dc_check]
---
 drivers/cxl/core/cdat.c | 11 +++++++++++
 drivers/cxl/core/mbox.c |  7 +++++++
 drivers/cxl/cxlmem.h    |  2 ++
 include/cxl/cxl.h       |  4 ++++
 4 files changed, 24 insertions(+)

diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c
index 5c9f07262513..c5f3d2ebea55 100644
--- a/drivers/cxl/core/cdat.c
+++ b/drivers/cxl/core/cdat.c
@@ -17,6 +17,7 @@ struct dsmas_entry {
 	struct access_coordinate cdat_coord[ACCESS_COORDINATE_MAX];
 	int entries;
 	int qos_class;
+	bool shareable;
 };
 
 static u32 cdat_normalize(u16 entry, u64 base, u8 type)
@@ -74,6 +75,7 @@ static int cdat_dsmas_handler(union acpi_subtable_headers *header, void *arg,
 		return -ENOMEM;
 
 	dent->handle = dsmas->dsmad_handle;
+	dent->shareable = dsmas->flags & ACPI_CDAT_DSMAS_SHAREABLE;
 	dent->dpa_range.start = le64_to_cpu((__force __le64)dsmas->dpa_base_address);
 	dent->dpa_range.end = le64_to_cpu((__force __le64)dsmas->dpa_base_address) +
 			      le64_to_cpu((__force __le64)dsmas->dpa_length) - 1;
@@ -244,6 +246,7 @@ static void update_perf_entry(struct device *dev, struct dsmas_entry *dent,
 		dpa_perf->coord[i] = dent->coord[i];
 		dpa_perf->cdat_coord[i] = dent->cdat_coord[i];
 	}
+	dpa_perf->shareable = dent->shareable;
 	dpa_perf->dpa_range = dent->dpa_range;
 	dpa_perf->qos_class = dent->qos_class;
 	dev_dbg(dev,
@@ -266,13 +269,21 @@ static void cxl_memdev_set_qos_class(struct cxl_dev_state *cxlds,
 	bool found = false;
 
 	for (int i = 0; i < cxlds->nr_partitions; i++) {
+		enum cxl_partition_mode mode = cxlds->part[i].mode;
 		struct resource *res = &cxlds->part[i].res;
+		u8 handle = cxlds->part[i].handle;
 		struct range range = {
 			.start = res->start,
 			.end = res->end,
 		};
 
 		if (range_contains(&range, &dent->dpa_range)) {
+			if (mode == CXL_PARTMODE_DYNAMIC_RAM_A &&
+			    dent->handle != handle)
+				dev_warn(dev,
+					 "Dynamic RAM perf mismatch; %pra (%u) vs %pra (%u)\n",
+					 &range, handle, &dent->dpa_range, dent->handle);
+
 			update_perf_entry(dev, dent, &cxlds->part[i].perf);
 			found = true;
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 71b29cd6abfe..f9a5e21f5d09 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1356,10 +1356,16 @@ static int cxl_dc_check(struct device *dev, struct cxl_dc_partition_info *part_a
 {
 	size_t blk_size = le64_to_cpu(dev_part->block_size);
 	size_t len = le64_to_cpu(dev_part->length);
+	u32 handle = le32_to_cpu(dev_part->dsmad_handle);
 
 	part_array[index].start = le64_to_cpu(dev_part->base);
 	part_array[index].size = le64_to_cpu(dev_part->decode_length);
 	part_array[index].size *= CXL_CAPACITY_MULTIPLIER;
+	if (handle & ~0xFF) {
+		dev_warn(dev, "DSMAD handle 0x%x has non-zero reserved bits\n", handle);
+		return -EINVAL;
+	}
+	part_array[index].handle = handle;
 
 	/* Check partitions are in increasing DPA order */
 	if (index > 0) {
@@ -1494,6 +1500,7 @@ int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
 	/* Return 1st partition */
 	dc_info->start = partitions[0].start;
 	dc_info->size = partitions[0].size;
+	dc_info->handle = partitions[0].handle;
 	dev_dbg(dev, "Returning partition 0 %zu size %zu\n",
 		dc_info->start, dc_info->size);
 
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 87386488ad10..cee936fb3d03 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -118,6 +118,7 @@ struct cxl_dpa_info {
 	struct cxl_dpa_part_info {
 		struct range range;
 		enum cxl_partition_mode mode;
+		u8 handle;
 	} part[CXL_NR_PARTITIONS_MAX];
 	int nr_partitions;
 };
@@ -818,6 +819,7 @@ int cxl_dev_state_identify(struct cxl_memdev_state *mds);
 struct cxl_dc_partition_info {
 	size_t start;
 	size_t size;
+	u8 handle;
 };
 
 int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
index bb1df0cef863..51685a01d19c 100644
--- a/include/cxl/cxl.h
+++ b/include/cxl/cxl.h
@@ -122,12
+122,14 @@ struct cxl_register_map { * @coord: QoS performance data (i.e. latency, bandwidth) * @cdat_coord: raw QoS performance data from CDAT * @qos_class: QoS Class cookies + * @shareable: Is the range shareable */ struct cxl_dpa_perf { struct range dpa_range; struct access_coordinate coord[ACCESS_COORDINATE_MAX]; struct access_coordinate cdat_coord[ACCESS_COORDINATE_MAX]; int qos_class; + bool shareable; }; =20 enum cxl_partition_mode { @@ -141,11 +143,13 @@ enum cxl_partition_mode { * @res: shortcut to the partition in the DPA resource tree (cxlds->dpa_re= s) * @perf: performance attributes of the partition from CDAT * @mode: operation mode for the DPA capacity, e.g. ram, pmem, dynamic... + * @handle: DSMAS handle intended to represent this partition */ struct cxl_dpa_partition { struct resource res; struct cxl_dpa_perf perf; enum cxl_partition_mode mode; + u8 handle; }; =20 #define CXL_NR_PARTITIONS_MAX 3 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dy1-f174.google.com (mail-dy1-f174.google.com [74.125.82.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3FCD9385503 for ; Sat, 23 May 2026 09:43:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529426; cv=none; b=ebycxTMkE28Vhj2SeXR4tnnAxQ6YUjSvCwFYnm9eBCefNEsivxpnPRwwCevux/V6CWf6crBt7c/U6uuxQXBG0ob4IvgpBhP++41JamHt02FrENXkVzKvYthklXUSei4dOvs6KLA+mp46BoHGHIsw419dEAZjAZO08JS0CTegnhY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529426; c=relaxed/simple; bh=mimTWYFTaHY+FLSmEscwRLS2qecNH3JETIBfD4afc9g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=k+n1thxybUoj6K7tM6+WB/ubs727ag89Rw8ostEKq4HyHMPnLQShM6hrLh2meg7J7X0Q84zRURUq603MwRr8Cg/zkjVpDTr+qve3nzSVmgJRYOBoP/GWDtWH+hsmA2DYJ+9QUjING+XZZiIQnv1du6YBo1OO19Y+YLwIW47IzG4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JyLJGjlN; arc=none smtp.client-ip=74.125.82.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JyLJGjlN" Received: by mail-dy1-f174.google.com with SMTP id 5a478bee46e88-3042a388168so1871293eec.1 for ; Sat, 23 May 2026 02:43:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529424; x=1780134224; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=snhhVzpNXLnSXNel0j3XIF8QPv37i0IqA2Elwz6NcMg=; b=JyLJGjlN9X1A6O8HY253x6SRS7KuoeOyc/Hj7DrMbsHappKVNozN4SyMKsYfCEUXuu /GXO67ZFG/HphCHoxfl0itJ2O5TEdt722iw8V9/GRt3IXywMDUjW0P1IqfLkXRZYPlnd uo9TVdyaoRCLxVBNLjW0N38uBsldANdCP6UlXIo0HOH/Gx6e8IwppfkoZVWtaM0YK1Yv 9gR5dIEnAvMNCLs5LMcPDYXuZPywMQKzikH8AU2LkXB8pZgqqEmyL8SUCqzRp2ZzkMSw xJMbR/CwPIiyptlb9VN9UrGZK+smI28Y1Iat7jvbBCjEeYfJ0iF1V83Y+v9i2dqUKjun SeKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529424; x=1780134224; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=snhhVzpNXLnSXNel0j3XIF8QPv37i0IqA2Elwz6NcMg=; b=o1LpiAc8rMDvP3xvjdjJniz9IT6OkXAXYpVaUEA8mcqXx7Q8zdkIwygQLDBoGzd848 
i+bVezFbOcdLIJudYpUkW8zhLXaRO6kACJGjZFvLl7x5+DO3HlveNBaAADh0rTshVjLz BSWV45zyy4QvKu0hS7bwiAdxZgaILwPaPa676UqRYGhf4zSZ0ggKF5ODrtbYtYm5uIDr 8AkG+5S+j8NC5M8UgOSir/ZoHFQIpeuQwEeJQ6qbgLKX/rHs6Vhgt+1mi8tcBn6N3TlV jht5Ajd0hTM66PCYKd+werlpx4zIDTWeVjalSkrQKwbDk7qIOkJDI3EZ457SZdAcVV3N 09Ew== X-Forwarded-Encrypted: i=1; AFNElJ/g2hRH46I2DDSmuiBfZlq0GEuLlNJlmOBlgHR+lOu5Mk+goFGX9ZOZuz5uYVMs4yRmThdIWdb3so6efJc=@vger.kernel.org X-Gm-Message-State: AOJu0YyC55okeQnddRBIotSsJNntDHpViyLNTZf3vRQ6Tb4R0xdwpn3i L5pCeOMqyFToF0XVUjmI34I/X+TLePpwIe7UC9+lWxzqaZinPqO2ZMEg X-Gm-Gg: Acq92OEZoRtEgXB4+OHDdeSEL8voLc/JMA9wjreTIiRWSy8Chge7Q41jJDRHqipL5Sp UdGZC5vKpLqqsnuxS4x2oEvElhM9H0bMHSEYtj9lHHx0KY/926V9Tfob33I4GPp0hPVS/hCot6p 8VvFgVScQr+MBmXWp2z/47hmmoj19/9ldBceBf9LuPSzLSfAAUOxgm9iPgMThAgBCYG/Gwz56ms Q0EnxNa1221OOADgZgQebmO5I8tw97I4XK+QfpvwI3O+sf+EeIFxRIEkp1D2k98gcnJfr0Vty0q Diil8vUd3JEYM7ImUI41w7HFYxaSEdfVKnoy+EmQB7AEMVh4JBh5B4wvppa5tf6hoWRBA0YP55p 0rKuOxoTA6XBIA6PMZwni4ZiAnoO1dIkWEouWFRCtvAaEXoQRbIC5LtmDzUmYQn6w/k2XcmYIQi ND1eCgVVNHD2zJk8wYtgdMYt8Eoe4aKFXlCPrToarKyJT5AR/lS11C62miLbnzpAXeHZD5HUh71 P4zMLo= X-Received: by 2002:a05:7022:609c:b0:11a:fb0a:ceca with SMTP id a92af1059eb24-13633eb60b4mr3619709c88.16.1779529424123; Sat, 23 May 2026 02:43:44 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:43 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny Subject: [PATCH v10 04/31] cxl/core: Enforce partition order/simplify partition calls Date: Sat, 23 May 2026 02:42:58 -0700 Message-ID: <22ae445b8a99d26299520e2429c5bf4e64b0d9e6.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny Device partitions have an implied order which is made more complex by the addition of a dynamic partition. Remove the ram special case information calls in favor of generic calls with a check ahead of time to ensure the preservation of the implied partition order. 
Signed-off-by: Ira Weiny --- Changes: [anisa: rebase] [davidlohr: core/hdm.c: return -EINVAL instead of 0 in cxl_dpa_setup if partitions are out of order] --- drivers/cxl/core/hdm.c | 11 ++++++++++- drivers/cxl/core/memdev.c | 32 +++++++++----------------------- drivers/cxl/cxlmem.h | 9 +++------ drivers/cxl/mem.c | 2 +- 4 files changed, 23 insertions(+), 31 deletions(-) diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c index 28974adaab75..7a5812971f8f 100644 --- a/drivers/cxl/core/hdm.c +++ b/drivers/cxl/core/hdm.c @@ -464,6 +464,7 @@ static const char *cxl_mode_name(enum cxl_partition_mod= e mode) int cxl_dpa_setup(struct cxl_dev_state *cxlds, const struct cxl_dpa_info *= info) { struct device *dev =3D cxlds->dev; + int i; =20 guard(rwsem_write)(&cxl_rwsem.dpa); =20 @@ -476,9 +477,17 @@ int cxl_dpa_setup(struct cxl_dev_state *cxlds, const s= truct cxl_dpa_info *info) return 0; } =20 + /* Verify partitions are in expected order (cxlds->part[] is not yet populated). */ + for (i =3D 1; i < info->nr_partitions; i++) { + if (info->part[i].mode < info->part[i - 1].mode) { + dev_err(dev, "Partition order mismatch\n"); + return -EINVAL; + } + } + cxlds->dpa_res =3D DEFINE_RES_MEM(0, info->size); =20 - for (int i =3D 0; i < info->nr_partitions; i++) { + for (i =3D 0; i < info->nr_partitions; i++) { const struct cxl_dpa_part_info *part =3D &info->part[i]; int rc; =20 diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c index 80e65690eb77..71602820f896 100644 --- a/drivers/cxl/core/memdev.c +++ b/drivers/cxl/core/memdev.c @@ -75,20 +75,12 @@ static ssize_t label_storage_size_show(struct device *d= ev, } static DEVICE_ATTR_RO(label_storage_size); =20 -static resource_size_t cxl_ram_size(struct cxl_dev_state *cxlds) -{ - /* Static RAM is only expected at partition 0. 
*/ - if (cxlds->part[0].mode !=3D CXL_PARTMODE_RAM) - return 0; - return resource_size(&cxlds->part[0].res); -} - static ssize_t ram_size_show(struct device *dev, struct device_attribute *= attr, char *buf) { struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); struct cxl_dev_state *cxlds =3D cxlmd->cxlds; - unsigned long long len =3D cxl_ram_size(cxlds); + unsigned long long len =3D cxl_part_size(cxlds, CXL_PARTMODE_RAM); =20 return sysfs_emit(buf, "%#llx\n", len); } @@ -101,7 +93,7 @@ static ssize_t pmem_size_show(struct device *dev, struct= device_attribute *attr, { struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); struct cxl_dev_state *cxlds =3D cxlmd->cxlds; - unsigned long long len =3D cxl_pmem_size(cxlds); + unsigned long long len =3D cxl_part_size(cxlds, CXL_PARTMODE_PMEM); =20 return sysfs_emit(buf, "%#llx\n", len); } @@ -424,10 +416,11 @@ static struct attribute *cxl_memdev_attributes[] =3D { NULL, }; =20 -static struct cxl_dpa_perf *to_pmem_perf(struct cxl_dev_state *cxlds) +static struct cxl_dpa_perf *part_perf(struct cxl_dev_state *cxlds, + enum cxl_partition_mode mode) { for (int i =3D 0; i < cxlds->nr_partitions; i++) - if (cxlds->part[i].mode =3D=3D CXL_PARTMODE_PMEM) + if (cxlds->part[i].mode =3D=3D mode) return &cxlds->part[i].perf; return NULL; } @@ -438,7 +431,7 @@ static ssize_t pmem_qos_class_show(struct device *dev, struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); struct cxl_dev_state *cxlds =3D cxlmd->cxlds; =20 - return sysfs_emit(buf, "%d\n", to_pmem_perf(cxlds)->qos_class); + return sysfs_emit(buf, "%d\n", part_perf(cxlds, CXL_PARTMODE_PMEM)->qos_c= lass); } =20 static struct device_attribute dev_attr_pmem_qos_class =3D @@ -450,20 +443,13 @@ static struct attribute *cxl_memdev_pmem_attributes[]= =3D { NULL, }; =20 -static struct cxl_dpa_perf *to_ram_perf(struct cxl_dev_state *cxlds) -{ - if (cxlds->part[0].mode !=3D CXL_PARTMODE_RAM) - return NULL; - return &cxlds->part[0].perf; -} - static ssize_t ram_qos_class_show(struct device *dev, struct 
device_attribute *attr, char *buf) { struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); struct cxl_dev_state *cxlds =3D cxlmd->cxlds; =20 - return sysfs_emit(buf, "%d\n", to_ram_perf(cxlds)->qos_class); + return sysfs_emit(buf, "%d\n", part_perf(cxlds, CXL_PARTMODE_RAM)->qos_cl= ass); } =20 static struct device_attribute dev_attr_ram_qos_class =3D @@ -499,7 +485,7 @@ static umode_t cxl_ram_visible(struct kobject *kobj, st= ruct attribute *a, int n) { struct device *dev =3D kobj_to_dev(kobj); struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); - struct cxl_dpa_perf *perf =3D to_ram_perf(cxlmd->cxlds); + struct cxl_dpa_perf *perf =3D part_perf(cxlmd->cxlds, CXL_PARTMODE_RAM); =20 if (a =3D=3D &dev_attr_ram_qos_class.attr && (!perf || perf->qos_class =3D=3D CXL_QOS_CLASS_INVALID)) @@ -518,7 +504,7 @@ static umode_t cxl_pmem_visible(struct kobject *kobj, s= truct attribute *a, int n { struct device *dev =3D kobj_to_dev(kobj); struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); - struct cxl_dpa_perf *perf =3D to_pmem_perf(cxlmd->cxlds); + struct cxl_dpa_perf *perf =3D part_perf(cxlmd->cxlds, CXL_PARTMODE_PMEM); =20 if (a =3D=3D &dev_attr_pmem_qos_class.attr && (!perf || perf->qos_class =3D=3D CXL_QOS_CLASS_INVALID)) diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index cee936fb3d03..10175ca3b7ee 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -383,14 +383,11 @@ struct cxl_security_state { =20 #define CXL_MAX_DC_PARTITIONS 8 =20 -static inline resource_size_t cxl_pmem_size(struct cxl_dev_state *cxlds) +static inline resource_size_t cxl_part_size(struct cxl_dev_state *cxlds, + enum cxl_partition_mode mode) { - /* - * Static PMEM may be at partition index 0 when there is no static RAM - * capacity. 
- */ for (int i =3D 0; i < cxlds->nr_partitions; i++) - if (cxlds->part[i].mode =3D=3D CXL_PARTMODE_PMEM) + if (cxlds->part[i].mode =3D=3D mode) return resource_size(&cxlds->part[i].res); return 0; } diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c index fcffe24dcb42..f19e08279ec7 100644 --- a/drivers/cxl/mem.c +++ b/drivers/cxl/mem.c @@ -114,7 +114,7 @@ static int cxl_mem_probe(struct device *dev) return -ENXIO; } =20 - if (cxl_pmem_size(cxlds) && IS_ENABLED(CONFIG_CXL_PMEM)) { + if (cxl_part_size(cxlds, CXL_PARTMODE_PMEM) && IS_ENABLED(CONFIG_CXL_PMEM= )) { rc =3D devm_cxl_add_nvdimm(dev, parent_port, cxlmd); if (rc) { if (rc =3D=3D -ENODEV) --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f53.google.com (mail-dl1-f53.google.com [74.125.82.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D2E7238734D for ; Sat, 23 May 2026 09:43:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529428; cv=none; b=nEv2Yj+Z/mnlNAWRXdQgFKkj8HzSrjjMO8wWrNN1JprldzR81DdRFU58SOLRNQqZARYt14cUyXlk5mKB/iYLITFcxc0HQ8P78+7VmKmm+6C4jHrRqqfxnOvArht2t/YvVp/ANKB665mD6osV+poE2teL5ma2ZNrY1pA9rn9TNu0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529428; c=relaxed/simple; bh=lUv0SjAAaw3E/kgV1KeDK72Wh6iLEtGftRf81Pn66SM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=C4vDYR+VxUFM9YKP+BpYfe/pi6pJWwOtCl72yfSXh9wedaT3zIVLq/y++uTrcByKBLC6DghgDBkaPH9l6qF+uUAn6F1BooASnm2OTnYGt6ucRf2fzstN0KVTtwufvr51eVM7w5+l8Y/7hklOebP8Nqn84FAbgwpmB5Ny1dCCBtc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=df+sGys1; arc=none 
smtp.client-ip=74.125.82.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="df+sGys1" Received: by mail-dl1-f53.google.com with SMTP id a92af1059eb24-1329fc4bf77so2963474c88.1 for ; Sat, 23 May 2026 02:43:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529426; x=1780134226; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=w6Dz9TYOBwWg06fFUo8qqEZEgSQ6r/8vuvT8PbFJpKY=; b=df+sGys1ryH7n92a6gZy18seyR/BDtNlIMQgrh+qGruMH00QyOjHHL2yC1AIoDQ4yG qg33blVD6/NzWdj1p4ZOfgJEz92vBs/9p4+8uocSmN3vQqY3QUyUZ2iu+skpi4sYtQSL oxMGIboZC0Oyv446KNZEYOmSZk9696XwJYim6ajYaTbIX02Zi6pToUmvuSFfZwaarGXn STdg5vIVxC/GOoPcVe3VDJhnFEC6PIJsc2ENoTI0RHa9r5UW5MuRiOBRuUPzWiaaK5AE rSWqVEb2Fz+YtBkxMGpW/Zfg04AG/WW4CiKgcwO6mgqiIYyAegUwdzSc4rxQptVmOOMf ZphQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529426; x=1780134226; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=w6Dz9TYOBwWg06fFUo8qqEZEgSQ6r/8vuvT8PbFJpKY=; b=B9UbFqu0/OitnaZrFVhyjI8hWB5VGoEYf/tYbZPFbjCGUTeFNlHbdIfJobo6EjyMg2 pCZqDggDLuuFEp8QGNTmZEeeVKbN3QK5smVyhcvyinppzgUWJjH9jmXwIoY7BJgWrkf1 R6/8eTpg5IuCCkU5HSKL74wIr6euHJjL+8gwlz8RFV9efgCJMt2CQi57mwIiOl29GAPx NvvnTmI8bp80vGej2fKX6F2MnBJbtHzRFyW5Dr/i10JtcDVCtmOhUHr7p2DMo6uTSvoE BiqaB4pdyYU6h7C53n0D2yY7jZZE1vvJRtykHEOtlw522lxxV3Xf0ETqzwOlWnuLcO+5 EzKg== X-Forwarded-Encrypted: i=1; AFNElJ/UcUW7B68X81u+3ZMx0fydgQCS7RdteYyucKg9OO3MLswtbIsalZMJ2/czInmOq+Tme9MgMNvuA5ye7Fc=@vger.kernel.org X-Gm-Message-State: 
AOJu0Yz2gSF7Xoe417sywp1Jgf664Vxsj+F9vtJMEOMtB2L7lanBNNt8 PXdY/ObyvVaPMjYfl1MUPpSRRad66BAwioAp3bntlYs2KILTAc8XWgeY X-Gm-Gg: Acq92OEypZeZWmlY7dPEqL52SSAGSVi8a4fmgu48BqmgNzpNAmbBZTFWs+xd9iqTHcS 9fw73lvBAZYCtogAfhfjA3so+c81kj/GZirP7GzKFjIf5s6O5G5nF2SX+Hppq1P4JPHJzizXMOY zRYBCKR1A2/z3WyNHDh+X8Up4IreeCvu5RofVNDZQDuCbb9mIxIz3815AUiyBwk6yLZZLzE8zo/ aVufN+NAp3BbRUBXUjSYnZfqqVunSb7Xd52DvwPUd/UKBB6juOmOp/woyk7En7FUdgJL5xgb2iu rn+K6mwO3/hH7UhisCPZ1n+HdZyYuWfjOKFG/JUyF9DBq+xWYimn4vQewhTP1YG/G1doItWD12+ Iz+0TXPArS4tlcLWCL7JWv9GxJPQrQdBzlEAiJ6HkGVAjCDqgw9967yGVIi2tpfbdyJgcydqvXG 8acHJ/d2LA2TTrOUeCczRK7xlOWdj4OqCoI93Ll/GB6TTO/7pKd18hxizpTxZijHNqaOodBjCKT 4lPSqc= X-Received: by 2002:a05:7022:258a:b0:12d:de3e:cc02 with SMTP id a92af1059eb24-1365fd80b13mr2830332c88.41.1779529425905; Sat, 23 May 2026 02:43:45 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:45 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny Subject: [PATCH v10 05/31] cxl/mem: Expose dynamic ram A partition in sysfs Date: Sat, 23 May 2026 02:42:59 -0700 Message-ID: <45bc277b11c1aabf495132925c0d75c78e3b5a8a.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny To properly configure CXL regions user space will need to know the details of the dynamic ram partition. 
Expose the first dynamic ram partition through sysfs. Signed-off-by: Ira Weiny --- Changes: [anisa: Update kernel version to 7.0] [davidlohr: Remove "persistent" from description of /sys/bus/cxl/devices/memX/dynamic_ram_a/qos_class] --- Documentation/ABI/testing/sysfs-bus-cxl | 24 +++++++++++ drivers/cxl/core/memdev.c | 57 +++++++++++++++++++++++++ 2 files changed, 81 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/te= sting/sysfs-bus-cxl index 16a9b3d2e2c0..3d95c325f6e0 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -89,6 +89,30 @@ Description: and there are platform specific performance related side-effects that may result. First class-id is displayed. =20 +What: /sys/bus/cxl/devices/memX/dynamic_ram_a/size +Date: May, 2025 +KernelVersion: v7.0 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) The first Dynamic RAM partition capacity in bytes. + + +What: /sys/bus/cxl/devices/memX/dynamic_ram_a/qos_class +Date: May, 2025 +KernelVersion: v7.0 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) For CXL host platforms that support "QoS Telemetry" + this attribute conveys a comma delimited list of platform + specific cookies that identifies a QoS performance class + for the partition of the CXL mem device. These + class-ids can be compared against a similar "qos_class" + published for a root decoder. While it is not required + that the endpoints map their local memory-class to a + matching platform class, mismatches are not recommended + and there are platform specific performance related + side-effects that may result. First class-id is displayed. 
+ =20 What: /sys/bus/cxl/devices/memX/serial Date: January, 2022 diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c index 71602820f896..064cfd628577 100644 --- a/drivers/cxl/core/memdev.c +++ b/drivers/cxl/core/memdev.c @@ -101,6 +101,19 @@ static ssize_t pmem_size_show(struct device *dev, stru= ct device_attribute *attr, static struct device_attribute dev_attr_pmem_size =3D __ATTR(size, 0444, pmem_size_show, NULL); =20 +static ssize_t dynamic_ram_a_size_show(struct device *dev, struct device_a= ttribute *attr, + char *buf) +{ + struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); + struct cxl_dev_state *cxlds =3D cxlmd->cxlds; + unsigned long long len =3D cxl_part_size(cxlds, CXL_PARTMODE_DYNAMIC_RAM_= A); + + return sysfs_emit(buf, "%#llx\n", len); +} + +static struct device_attribute dev_attr_dynamic_ram_a_size =3D + __ATTR(size, 0444, dynamic_ram_a_size_show, NULL); + static ssize_t serial_show(struct device *dev, struct device_attribute *at= tr, char *buf) { @@ -443,6 +456,25 @@ static struct attribute *cxl_memdev_pmem_attributes[] = =3D { NULL, }; =20 +static ssize_t dynamic_ram_a_qos_class_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); + struct cxl_dev_state *cxlds =3D cxlmd->cxlds; + + return sysfs_emit(buf, "%d\n", + part_perf(cxlds, CXL_PARTMODE_DYNAMIC_RAM_A)->qos_class); +} + +static struct device_attribute dev_attr_dynamic_ram_a_qos_class =3D + __ATTR(qos_class, 0444, dynamic_ram_a_qos_class_show, NULL); + +static struct attribute *cxl_memdev_dynamic_ram_a_attributes[] =3D { + &dev_attr_dynamic_ram_a_size.attr, + &dev_attr_dynamic_ram_a_qos_class.attr, + NULL, +}; + static ssize_t ram_qos_class_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -519,6 +551,29 @@ static struct attribute_group cxl_memdev_pmem_attribut= e_group =3D { .is_visible =3D cxl_pmem_visible, }; =20 +static umode_t cxl_dynamic_ram_a_visible(struct kobject *kobj, struct 
attr= ibute *a, int n) +{ + struct device *dev =3D kobj_to_dev(kobj); + struct cxl_memdev *cxlmd =3D to_cxl_memdev(dev); + struct cxl_dpa_perf *perf =3D part_perf(cxlmd->cxlds, CXL_PARTMODE_DYNAMI= C_RAM_A); + + if (a =3D=3D &dev_attr_dynamic_ram_a_qos_class.attr && + (!perf || perf->qos_class =3D=3D CXL_QOS_CLASS_INVALID)) + return 0; + + if (a =3D=3D &dev_attr_dynamic_ram_a_size.attr && + (!cxl_part_size(cxlmd->cxlds, CXL_PARTMODE_DYNAMIC_RAM_A))) + return 0; + + return a->mode; +} + +static struct attribute_group cxl_memdev_dynamic_ram_a_attribute_group =3D= { + .name =3D "dynamic_ram_a", + .attrs =3D cxl_memdev_dynamic_ram_a_attributes, + .is_visible =3D cxl_dynamic_ram_a_visible, +}; + static umode_t cxl_memdev_security_visible(struct kobject *kobj, struct attribute *a, int n) { @@ -547,6 +602,7 @@ static const struct attribute_group *cxl_memdev_attribu= te_groups[] =3D { &cxl_memdev_attribute_group, &cxl_memdev_ram_attribute_group, &cxl_memdev_pmem_attribute_group, + &cxl_memdev_dynamic_ram_a_attribute_group, &cxl_memdev_security_attribute_group, NULL, }; @@ -555,6 +611,7 @@ void cxl_memdev_update_perf(struct cxl_memdev *cxlmd) { sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_ram_attribute_group); sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_pmem_attribute_group); + sysfs_update_group(&cxlmd->dev.kobj, &cxl_memdev_dynamic_ram_a_attribute_= group); } EXPORT_SYMBOL_NS_GPL(cxl_memdev_update_perf, "CXL"); =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dy1-f179.google.com (mail-dy1-f179.google.com [74.125.82.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 424D3386C31 for ; Sat, 23 May 2026 09:43:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529430; cv=none; 
b=bdYi96IJFikAXHu1bsT0qvJ77mlZ/7YL0cJLZ7Nnh9IqCW89yTkoXp27lfZAM2bOcsRTs93sywaPtNDqiRF++BmWAF0fSyd87uj1iELFtjAVNOxursPwn+Y6m7UZMI+P2AB+RJlSGtCfwoX5Uy4Apynv07rnc/iK7zdlcWCyUdg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529430; c=relaxed/simple; bh=bhkVV5eWL9q2QBEtj+qST0UkfpKTX/AYf0RWbT4RAzI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HDQilpiG4EYk6ipDIVhnd2twYdEaPqlUFGZWkalbDM2uqBaDu3y2m5dSfs4vn+90UYwMJP1CXRNW849aYuAvVOH1kIJU5YD/jVO46+Ky72jEPmYRQeJIo8nByOBRpP8Ap7RFlmQva2OpWI+7kT2BYlsKHgg/9yPyTbMvv50LZuo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=kJ1R4Ufu; arc=none smtp.client-ip=74.125.82.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kJ1R4Ufu" Received: by mail-dy1-f179.google.com with SMTP id 5a478bee46e88-3044857f09aso2978998eec.1 for ; Sat, 23 May 2026 02:43:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529427; x=1780134227; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7WfczXqcXBpP3sctFNC30HRypl1xDvSsdjoS+nUQlBk=; b=kJ1R4Ufun7G3m0N6kv1r1+4lbuilaG7Adolk5GIUjQ9p32bZPDPH0wGc9AOafi9wbO fWgjao/xSyAkqgVKx+S0hl9wx7GItmWzj3b2mgS7X/Fyp9C4eDvkhWclt4DrFqE0u3Ml 4bx7K1+lIneusKMiGqIg2euxq2ZVsZS0bCo5uA2PuEATeshrWoXmn4wka2myDiYVsqoo stR3ma3SlX7FKi7vFgLFYddwf4wsVOFvCIIETmdN393PHlHktjTTIRwshjDFUd8KpIYO ogjC8itGvM9yP808jdFj7HRk+moqmiMAsfTllV7W/ckFihtSF8/Yla4Wm8z/Q4fTU8Uw kPOw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529427; x=1780134227; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=7WfczXqcXBpP3sctFNC30HRypl1xDvSsdjoS+nUQlBk=; b=otlBFVmkZ+G4+sRpuV9gyTmxqnB3mdrdAKuGJ/pjlKVY0GiL5+VsCRcCfgfJBANxIZ qLgnvtADgKXNnSBTAlngRl9p26DcelDhrvi0jFevRAO8PGulyQSriMNLjorO6LGL/NCb dbLsdk0nCjzdCMKnMiW8IFQTUvgiF5b0ctTIFR5bM2/Xz1O0uL0MWusl4+fxKHrKEHkQ d7TwmDFMgmWgdi2zjbViLkoHHmgUxzv1sfl4bjYu/MNous8AilVJ3S3/+C2Am+YbCAV4 xFRFGl3zVubQg2T1WAeoNmRkgFFNSEamLjstnSfUH/EZSICBDwjwzMPbczp5V7nNSgGK kXDA== X-Forwarded-Encrypted: i=1; AFNElJ+j8l7ufssmq8Q7YnN/Nx4+AZVJUiJPkdNy2pKO7KJ8HKAn+h5/2diAPK90b35E9qSJbJexCyU/4WDWs28=@vger.kernel.org X-Gm-Message-State: AOJu0Yx+5NNmiuDNef2yUbzeALY861dpkbxseATKNl9whZqdfNeCuj7q f2eLrUt4twFLjbfXzG/xqKkdeIBsGWiBfVh4u9iXErjxibWR7UNCOiH8 X-Gm-Gg: Acq92OFMefirgfDHCiwf2brsBEKwsbx3vuSCHXt6S/q+NFfznrHO2go7OlaeQ3MWXCi 4DK8JrNT9iwGGDh/mtRFtzeLqVSQYlB9bk9nUFHdPAdXdmKh0dNPaNQWtAwI6IaOFK1iLS2KR2W zu+IQK24wkVYXWA6KIS63Sz+/CtwRDlyTwzOPblXoMTq3l2L0qlv+S6GunB5SZUXGrzTKLR++4X tPl8cE06vasz+WW9ZBsMydb5yQI2176lRL2zgrmTE7/qFmtv1MQ2r44QfyTQbdTZE3G+9XcFE7y sM3NT5eC90PelSQ6A68VS9V7X2TeszMlh5PFG5BavzKpP8d/FPiUeWDcrDCnP+M3Saded7GRfHx z0xBLZHjkxPcbGbej2Um/BKeNDPXKwSRy00QforIfi8pxw5IGdUUNgQXli4kzz3ZbRu1d/acVfo WUOwhKaYa+X2ntiX9w/jH2GzVTRbjzNQF2X6wknzRDbqz20yT+85LpLvNZoDdCpL7XDIg2N5lrl vPH/d/d9OcgIMoyae0W+CHLSW5z X-Received: by 2002:a05:7022:6893:b0:12d:f0b1:75de with SMTP id a92af1059eb24-1365fb424admr2816473c88.22.1779529427374; Sat, 23 May 2026 02:43:47 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:47 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny Subject: [PATCH v10 06/31] cxl/port: Add 'dynamic_ram_a' to endpoint decoder mode Date: Sat, 23 May 2026 02:43:00 -0700 Message-ID: <58e5e5007cd11e0b8e65016f126144f187badb39.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny Endpoints can now support a single dynamic ram partition following the persistent memory partition. Expand the mode to allow a decoder to point to the first dynamic ram partition. Signed-off-by: Ira Weiny --- Changes: [anisa: rebase] --- Documentation/ABI/testing/sysfs-bus-cxl | 18 +++++++++--------- drivers/cxl/core/port.c | 4 ++++ 2 files changed, 13 insertions(+), 9 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/te= sting/sysfs-bus-cxl index 3d95c325f6e0..c604c7ca6432 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -358,22 +358,22 @@ Description: =20 =20 What: /sys/bus/cxl/devices/decoderX.Y/mode -Date: May, 2022 -KernelVersion: v6.0 +Date: May, 2022, May 2025 +KernelVersion: v6.0, v6.16 (dynamic_ram_a) Contact: linux-cxl@vger.kernel.org Description: (RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it translates from a host physical address range, to a device local address range. 
Device-local address ranges are further - split into a 'ram' (volatile memory) range and 'pmem' - (persistent memory) range. The 'mode' attribute emits one of - 'ram', 'pmem', or 'none'. The 'none' indicates the decoder is - not actively decoding, or no DPA allocation policy has been - set. + split into a 'ram' (volatile memory) range, a 'pmem' (persistent + memory) range, and a 'dynamic_ram_a' (first Dynamic RAM) range. The + 'mode' attribute emits one of 'ram', 'pmem', 'dynamic_ram_a', or + 'none'. The 'none' indicates the decoder is not actively + decoding, or no DPA allocation policy has been set. =20 'mode' can be written, when the decoder is in the 'disabled' - state, with either 'ram' or 'pmem' to set the boundaries for the - next allocation. + state, with one of 'ram', 'pmem', or 'dynamic_ram_a' to set the + boundaries for the next allocation. =20 =20 What: /sys/bus/cxl/devices/decoderX.Y/dpa_resource diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 0c5957d1d329..a7f71f36531f 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -128,6 +128,7 @@ static DEVICE_ATTR_RO(name) =20 CXL_DECODER_FLAG_ATTR(cap_pmem, CXL_DECODER_F_PMEM); CXL_DECODER_FLAG_ATTR(cap_ram, CXL_DECODER_F_RAM); +CXL_DECODER_FLAG_ATTR(cap_dynamic_ram_a, CXL_DECODER_F_RAM); CXL_DECODER_FLAG_ATTR(cap_type2, CXL_DECODER_F_TYPE2); CXL_DECODER_FLAG_ATTR(cap_type3, CXL_DECODER_F_TYPE3); CXL_DECODER_FLAG_ATTR(locked, CXL_DECODER_F_LOCK); @@ -222,6 +223,8 @@ static ssize_t mode_store(struct device *dev, struct de= vice_attribute *attr, mode =3D CXL_PARTMODE_PMEM; else if (sysfs_streq(buf, "ram")) mode =3D CXL_PARTMODE_RAM; + else if (sysfs_streq(buf, "dynamic_ram_a")) + mode =3D CXL_PARTMODE_DYNAMIC_RAM_A; else return -EINVAL; =20 @@ -327,6 +330,7 @@ static struct attribute_group cxl_decoder_base_attribut= e_group =3D { static struct attribute *cxl_decoder_root_attrs[] =3D { &dev_attr_cap_pmem.attr, &dev_attr_cap_ram.attr, + &dev_attr_cap_dynamic_ram_a.attr, 
&dev_attr_cap_type2.attr, &dev_attr_cap_type3.attr, &dev_attr_target_list.attr, --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dy1-f181.google.com (mail-dy1-f181.google.com [74.125.82.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D379D38734A for ; Sat, 23 May 2026 09:43:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529434; cv=none; b=FPP11ebNgbGgI6wcsIRd56kGnUQQRXUDG/U66sDNTVx6GXr8+5lQbRYV9345RuYJvNIiTEz6w6g6gn1fupRAG7agOjoAn+3YQGB2APIrC4bJB0DjzUTKtcKw1/5TzsjRGQ14Ed4p/q8vJB5fiX2xoY/Vp+LyEtnxEGqovtL0L20= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529434; c=relaxed/simple; bh=bnNosoa4wdcuqFo/b0ROOHywKfUx7eAMgCsyxmkMAUY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EbQ/PxBZfsrVvyyO/dQnreC8DapQeivf1av7aUGmrr08NKmG47bVrrqHvzXAfvaZ8sH22dsG2FN4jEseXXs1erLjLs5RKAybFsW0/8BkXdA39z7tizKFbAReGOkN+K3wFtwdcXl4DQIprwDFzNlytVftGFrMEBE0fJ16x2e1JmM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=TgFuuEVj; arc=none smtp.client-ip=74.125.82.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="TgFuuEVj" Received: by mail-dy1-f181.google.com with SMTP id 5a478bee46e88-3025d725a05so19383931eec.1 for ; Sat, 23 May 2026 02:43:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; 
t=1779529429; x=1780134229; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=j6DG2kcCXKCP6QJoWNPoCD9fZeNb6P0SZLPXjdbQtmw=; b=TgFuuEVjzRKXfcvAKjhWbUh0oZytjz42F8vWyKoH7jJemFD2JoUs87TsAGZLzV8NVt N3yMNbDRCGjtfldeJ2IiNqRy4r3buxSs6XyRaRWh4AFILMYIIXbsZZZmH5MjNpvzPY3b Rwi1byGuA+VIcModepyYpMBY7bAs52ur5J+v3jnPQMo5Y905ttTVwZbul4mjsMGufYkt qr3vsuwPs5uYgwePjL3FLUCieU1FQLYiCWeqdhd0/y3Klzc1HwiYR33HYcdLceqiGtox 3WdLZcEr+PrjYT0g35Z+sDnh0p43HoioEJuTjT4M/4ONQjoBNWqWga+UX5EuNsSprG+o 1Sjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529429; x=1780134229; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=j6DG2kcCXKCP6QJoWNPoCD9fZeNb6P0SZLPXjdbQtmw=; b=BH1mcSbYnpfkPAV1L5qll0Tm+bBUDvFd8uJGfhe5XfAwMEGe+W9J0Ymef1HGBoxn60 KLRmqpu0DkrfuXokhzXXaq60OH+lv2J51Lz3WVqDHJhRleXWl7/KFobxsma8OC7dU3dT r+vIjT2EbH3O9jiPcQub7bnLNKqVSQe847iN+U+bP7h0RS/4CG4iEZOG0UpZ7kECvfPr cqNVDnq8tbJ5a7GJs3UAnRphrqMvIvKRxU/LwgtWPRaXfHVoJrf024Gn9AH19u+1MFQm Z2eMDyxUAriUQ8I05Mh6i2mQuOG+sa+C+OjmO6GfBcQIbxx2LHoapReZzS/KRRT3cDFl 37yA== X-Forwarded-Encrypted: i=1; AFNElJ+c1JCQ9FJwjkocQx5daMfN766P8mVA+FvrkYKmkUsEogf+SvfIy9Ypjvf4aS9MabhGUvZxabZ4ZrBtpqQ=@vger.kernel.org X-Gm-Message-State: AOJu0YwtTclq5V+qzkrFzpFprW0L4uaio7Ec6yJGnI3Npuo8Vyx2GyY0 I+nUck2yTFEl+dn1Pe0y6jjzQdoZZLdd7StvAk+E+aL/FcXqSLsAV2Ey X-Gm-Gg: Acq92OH8c3Hi9t+UOrtT552K2GG2FPaJdr7GwTXbYg4WWwzslRZ6kzSlR0PvqaXnrm0 omb07oAD5Xd5NpPi4+Cu/Y1oo14nE9elooAAU00+5JKwgb7CkeVErizYOOSxvxOQThICpXcCve0 2yDoVXDfCkqhazh5kwD+U3zG2e9m2/+UIw5yCMj/NMjQ8DTR3Ibc+Az59tlQnOcM6bkLlbFg8cM 39yzzXTZJgNy9DvChEVxk58S3ipFa08XYjfi2LWiAbY2Q1Jl2oM7+eWWivXvovmgroW2O46OOmu 96OuUY7ZxQ+DhInYtF/I2mRqz1nuSUmX58rCfJaqbEVE5HlIhji1ZS5AZ473Zo9z5F/IY5GgOlX 
ddV4HKi1o4nUUXD63PwnqtobxQ+taFxXey6AwkQllTLJifuPk469yQ87XrYuwqIxhHI5alU0Pq2 /+a8Ko/0I5DcfYVNL1+OH/ofMI65wmk+oGI3ffsA3h3D2Mrw1HhAcmdv6s7QtRgn9+iaPzLqj2F fwWlPw= X-Received: by 2002:a05:7022:61b:b0:12d:b205:c737 with SMTP id a92af1059eb24-1365f91bdb4mr2813147c88.17.1779529428944; Sat, 23 May 2026 02:43:48 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:48 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny Subject: [PATCH v10 07/31] cxl/region: Add DC DAX region support Date: Sat, 23 May 2026 02:43:01 -0700 Message-ID: <9f0e0b3deeb1825ad113d7aebe7056dcf2bbc5f9.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny DC DAX regions must allow memory to be added or removed dynamically. In addition to the quantity of memory available, the location of the memory within a DC partition is dynamic, based on the extents offered by a device. CXL DAX regions must accommodate the dynamic movement of this memory in the management of DAX regions and devices. Introduce the concept of a dynamic DAX region. Introduce the create_dynamic_ram_a_region sysfs entry to create such regions. Special-case DC-capable regions to create a 0 sized seed DAX device to maintain compatibility, which requires a default DAX device to hold a region reference.
Indicate 0 bytes of available capacity until capacity is added. Dynamic regions complicate the range mapping of dax devices. There is no known use case for range mapping on dynamic regions. Avoid the complication by preventing range mapping of dax devices on dynamic regions. Interleaving is deferred for now. Add checks. Based on an original patch by Navneet Singh. Signed-off-by: Ira Weiny --- Changes: [anisa: rebase] [anisa: change "sparse" naming conventions to "dynamic"] --- Documentation/ABI/testing/sysfs-bus-cxl | 22 ++++++++--------- drivers/cxl/core/core.h | 11 +++++++++ drivers/cxl/core/port.c | 1 + drivers/cxl/core/region.c | 33 +++++++++++++++++++++++-- drivers/cxl/core/region_dax.c | 6 +++++ drivers/dax/bus.c | 10 ++++++++ drivers/dax/bus.h | 1 + drivers/dax/cxl.c | 17 +++++++++++-- 8 files changed, 86 insertions(+), 15 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl index c604c7ca6432..3080aef9ad67 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -434,20 +434,20 @@ Description: interleave_granularity). =20 =20 -What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region -Date: May, 2022, January, 2023 -KernelVersion: v6.0 (pmem), v6.3 (ram) +What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram,dynamic_ram_a}_region +Date: May, 2022, January, 2023, May 2025 +KernelVersion: v6.0 (pmem), v6.3 (ram), v6.16 (dynamic_ram_a) Contact: linux-cxl@vger.kernel.org Description: (RW) Write a string in the form 'regionZ' to start the process - of defining a new persistent, or volatile memory region - (interleave-set) within the decode range bounded by root decoder - 'decoderX.Y'. The value written must match the current value - returned from reading this attribute. An atomic compare exchange - operation is done on write to assign the requested id to a - region and allocate the region-id for the next creation attempt.
- EBUSY is returned if the region name written does not match the - current cached value. + of defining a new persistent, volatile, or dynamic RAM memory + region (interleave-set) within the decode range bounded by root + decoder 'decoderX.Y'. The value written must match the current + value returned from reading this attribute. An atomic compare + exchange operation is done on write to assign the requested id + to a region and allocate the region-id for the next creation + attempt. EBUSY is returned if the region name written does not + match the current cached value. =20 =20 What: /sys/bus/cxl/devices/decoderX.Y/delete_region diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 82ca3a476708..8881cc9323e0 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -6,6 +6,7 @@ =20 #include #include +#include =20 extern const struct device_type cxl_nvdimm_bridge_type; extern const struct device_type cxl_nvdimm_type; @@ -18,6 +19,15 @@ enum cxl_detach_mode { DETACH_INVALIDATE, }; =20 +static inline struct cxl_memdev_state * +cxled_to_mds(struct cxl_endpoint_decoder *cxled) +{ + struct cxl_memdev *cxlmd =3D cxled_to_memdev(cxled); + struct cxl_dev_state *cxlds =3D cxlmd->cxlds; + + return container_of(cxlds, struct cxl_memdev_state, cxlds); +} + #ifdef CONFIG_CXL_REGION =20 struct cxl_region_context { @@ -29,6 +39,7 @@ struct cxl_region_context { =20 extern struct device_attribute dev_attr_create_pmem_region; extern struct device_attribute dev_attr_create_ram_region; +extern struct device_attribute dev_attr_create_dynamic_ram_a_region; extern struct device_attribute dev_attr_delete_region; extern struct device_attribute dev_attr_region; extern const struct device_type cxl_pmem_region_type; diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index a7f71f36531f..2d33001dac26 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -337,6 +337,7 @@ static struct attribute *cxl_decoder_root_attrs[] =3D { 
&dev_attr_qos_class.attr, SET_CXL_REGION_ATTR(create_pmem_region) SET_CXL_REGION_ATTR(create_ram_region) + SET_CXL_REGION_ATTR(create_dynamic_ram_a_region) SET_CXL_REGION_ATTR(delete_region) NULL, }; diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index edc267c6cf77..7561bf3d8af8 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -493,6 +493,11 @@ static int set_interleave_ways(struct cxl_region *cxlr= , int val) int save, rc; u8 iw; =20 + if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A && val !=3D 1) { + dev_err(&cxlr->dev, "Interleaving and DCD not supported\n"); + return -EINVAL; + } + rc =3D ways_to_eiw(val, &iw); if (rc) return rc; @@ -2389,6 +2394,7 @@ static size_t store_targetN(struct cxl_region *cxlr, = const char *buf, int pos, if (sysfs_streq(buf, "\n")) rc =3D detach_target(cxlr, pos); else { + struct cxl_endpoint_decoder *cxled; struct device *dev; =20 dev =3D bus_find_device_by_name(&cxl_bus_type, NULL, buf); @@ -2400,8 +2406,14 @@ static size_t store_targetN(struct cxl_region *cxlr,= const char *buf, int pos, goto out; } =20 - rc =3D attach_target(cxlr, to_cxl_endpoint_decoder(dev), pos, - TASK_INTERRUPTIBLE); + cxled =3D to_cxl_endpoint_decoder(dev); + if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A && + !cxl_dcd_supported(cxled_to_mds(cxled))) { + dev_dbg(dev, "DCD unsupported\n"); + rc =3D -EINVAL; + goto out; + } + rc =3D attach_target(cxlr, cxled, pos, TASK_INTERRUPTIBLE); out: put_device(dev); } @@ -2750,6 +2762,7 @@ static struct cxl_region *__create_region(struct cxl_= root_decoder *cxlrd, switch (mode) { case CXL_PARTMODE_RAM: case CXL_PARTMODE_PMEM: + case CXL_PARTMODE_DYNAMIC_RAM_A: break; default: dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode); @@ -2802,6 +2815,21 @@ static ssize_t create_ram_region_store(struct device= *dev, } DEVICE_ATTR_RW(create_ram_region); =20 +static ssize_t create_dynamic_ram_a_region_show(struct device *dev, + struct device_attribute *attr, + char *buf) 
+{ + return __create_region_show(to_cxl_root_decoder(dev), buf); +} + +static ssize_t create_dynamic_ram_a_region_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + return create_region_store(dev, buf, len, CXL_PARTMODE_DYNAMIC_RAM_A); +} +DEVICE_ATTR_RW(create_dynamic_ram_a_region); + static ssize_t region_show(struct device *dev, struct device_attribute *at= tr, char *buf) { @@ -4081,6 +4109,7 @@ static int cxl_region_probe(struct device *dev) =20 return devm_cxl_add_pmem_region(cxlr); case CXL_PARTMODE_RAM: + case CXL_PARTMODE_DYNAMIC_RAM_A: rc =3D devm_cxl_region_edac_register(cxlr); if (rc) dev_dbg(&cxlr->dev, "CXL EDAC registration for region_id=3D%d failed\n", diff --git a/drivers/cxl/core/region_dax.c b/drivers/cxl/core/region_dax.c index de04f78f6ad8..d6bf69155827 100644 --- a/drivers/cxl/core/region_dax.c +++ b/drivers/cxl/core/region_dax.c @@ -84,6 +84,12 @@ int devm_cxl_add_dax_region(struct cxl_region *cxlr) struct device *dev; int rc; =20 + if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A && + cxlr->params.interleave_ways !=3D 1) { + dev_err(&cxlr->dev, "Interleaving DC not supported\n"); + return -EINVAL; + } + struct cxl_dax_region *cxlr_dax __free(put_cxl_dax_region) =3D cxl_dax_region_alloc(cxlr); if (IS_ERR(cxlr_dax)) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 95aee2a037fb..b0c2162b5e37 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -181,6 +181,11 @@ static bool is_static(struct dax_region *dax_region) return (dax_region->res.flags & IORESOURCE_DAX_STATIC) !=3D 0; } =20 +static bool is_dynamic(struct dax_region *dax_region) +{ + return (dax_region->res.flags & IORESOURCE_DAX_DCD) !=3D 0; +} + bool static_dev_dax(struct dev_dax *dev_dax) { return is_static(dev_dax->region); @@ -304,6 +309,9 @@ static unsigned long long dax_region_avail_size(struct = dax_region *dax_region) =20 lockdep_assert_held(&dax_region_rwsem); =20 + if (is_dynamic(dax_region)) + return 0; + 
for_each_dax_region_resource(dax_region, res) size -=3D resource_size(res); return size; @@ -1389,6 +1397,8 @@ static umode_t dev_dax_visible(struct kobject *kobj, = struct attribute *a, int n) return 0; if (a =3D=3D &dev_attr_mapping.attr && is_static(dax_region)) return 0; + if (a =3D=3D &dev_attr_mapping.attr && is_dynamic(dax_region)) + return 0; if ((a =3D=3D &dev_attr_align.attr || a =3D=3D &dev_attr_size.attr) && is_static(dax_region)) return 0444; diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index 5909171a4428..6e739bfab932 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -15,6 +15,7 @@ struct dax_region; /* dax bus specific ioresource flags */ #define IORESOURCE_DAX_STATIC BIT(0) #define IORESOURCE_DAX_KMEM BIT(1) +#define IORESOURCE_DAX_DCD BIT(2) =20 struct dax_region *alloc_dax_region(struct device *parent, int region_id, struct range *range, int target_node, unsigned int align, diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index 3ab39b77843d..f58fe992aa8d 100644 --- a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -13,19 +13,32 @@ static int cxl_dax_region_probe(struct device *dev) struct cxl_region *cxlr =3D cxlr_dax->cxlr; struct dax_region *dax_region; struct dev_dax_data data; + resource_size_t dev_size; + unsigned long flags; =20 if (nid =3D=3D NUMA_NO_NODE) nid =3D memory_add_physaddr_to_nid(cxlr_dax->hpa_range.start); =20 + if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A) + flags =3D IORESOURCE_DAX_DCD; + else + flags =3D IORESOURCE_DAX_KMEM; + dax_region =3D alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid, - PMD_SIZE, IORESOURCE_DAX_KMEM); + PMD_SIZE, flags); if (!dax_region) return -ENOMEM; =20 + if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A) + /* Add empty seed dax device */ + dev_size =3D 0; + else + dev_size =3D range_len(&cxlr_dax->hpa_range); + data =3D (struct dev_dax_data) { .dax_region =3D dax_region, .id =3D -1, - .size =3D range_len(&cxlr_dax->hpa_range), + .size =3D dev_size, .memmap_on_memory =3D 
true, }; =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f43.google.com (mail-dl1-f43.google.com [74.125.82.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 39A79386422 for ; Sat, 23 May 2026 09:43:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529435; cv=none; b=pIcHVJcSsXiKgBgqQQG4AEK5X7YGVLOkAwFKujRWD0TIGxn0+uC+INvnqAJDKx6R4PYb/iZtWUTWVb3QnfcLGhKAyou8hgIwQtvCryLtgvQIUCTTspAhSuerd70l4G63xkumiJ9CrowFkCjN7HKSId+lilEU6mS9sdqTbqynbQw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529435; c=relaxed/simple; bh=TRNy5M/WN3qqAniDdN+oE2OFskxJxYDlCStFug8uZT4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=p29PqfuAqU18dDthovrXcd8O2db7nRiTvTlfYz7EfAMgYLaO7volIBhWcb+QeJdxWF6VjbUPhJCzZ+7Vw5rCND2wKLFgupq1TImQbdSGKEgYedFumpWEjqYCggwhcUs2fDUZ0I2z2WGr5vgmg0OM0WUpgmVjflwVD82a+hzQY3s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=gAECiOy+; arc=none smtp.client-ip=74.125.82.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="gAECiOy+" Received: by mail-dl1-f43.google.com with SMTP id a92af1059eb24-130c9dcbd25so5271428c88.1 for ; Sat, 23 May 2026 02:43:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529431; x=1780134231; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=s6d+nwILJMu/HApS7ffo+r9TZ58+uVhFX0f64Cee/yY=; b=gAECiOy+XAOqgHZWBqzWk4XESAncrOn4/i34Rz8ZrOzP1iRUEWCamu59isinTPfBER eH3NU+Xw6xjdSs66dRurNrgEV8POIIsr4UTlGcDH6AHKDIWgf5ij7msqvEIkgPMnGuFk kOusT04r/iu1/U5yi9maHMT+bl7oG6c5zN2nzpK69ZT8ftfy/8Oz4m9FrGRXi1fIjoQy O12HJwscEliabKujnJlF7YRKivMDA+PBGnsGldvvVdc4hp4GoZHmB2hn7VK3WQ9VM+0/ T5r8+evrjj2jFA6VZdrhFKt574dW3cx7JSm+35Cg0s8vBTq6QxTEJ6KOSX5hWwZl1H72 VlkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529431; x=1780134231; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=s6d+nwILJMu/HApS7ffo+r9TZ58+uVhFX0f64Cee/yY=; b=n2W3DoAtIR8mei3JfsznBkUR0fvkqx6R3OV6yoWbcyo1A7Xa8qWcuU/1RhEvl9n1d6 OvZwCZnJsCukMROSu0cAu4kUEdbSuTH1vCVGtIxVu5zf5YTtCMTWSdScDvfNCYKrLCXO KrowvREskD8z3h5pfYcpjWbgmUMxceq18M7CNIzmky+TspPSPM8lIcf/SjLG9T9c8DDq xea/q1SB96D/t8uWaqDETqAS8omtxKdPNTe/Ogml1vTdw0ZgY9LqdfhCoZV8T0bxAYGz N1FxwdQ3nGaUKo47HNrclEM3+lgI+CJuOVOeAmml+5QLHrG211hdQadDoffAPKxT3PeZ 3SKg== X-Forwarded-Encrypted: i=1; AFNElJ9HhtQFfsNyBup2lSLCD1a9H/3q9WKsEtwpxTfbf/p52GNctSSuHGmYFhF7nesOZ3zRY7N4voqsM/OGU9g=@vger.kernel.org X-Gm-Message-State: AOJu0YzjNybr5sGjMEFoj5id4XlIl5cspYj6Jks2Kbbmtxr7t7Xkvsv1 18vv6XAG+omjr/j9pobH7ULBuVOcCZON6CGk//FgHG39ntxkjDyKQeph X-Gm-Gg: Acq92OFcSHH2HwP6grRoN/e8RS2TKmS1zoGZTtf1nVbVO6mluCmq25mxYOEjmLOzNhp P2CSmojoJSMf31LPQCg2UKY2EykZKEZFHNCIKyzep3fRR+LASvzI1nccIXhPPeF8y0Y6AJlOvf4 HPeqN82LejPgmOyvJ58GrrUnmxhWpJbm1p2gjfKtWnkGAJCBD7geC5X2SfcEV9p95bPZw3xFeVe SxD0cwJNukwIK1i2hCgA4GqKIbqPz4E7PkUgGvBqBgK0bvtlHJ3bbyMmxb5BwCwuXSQSqD8R+O1 4bSZvYuUPz0trR94sQ8SRHR+6y9PuZvqvmjcFzo/mOJnRSlLAy/pDmM6qbui8eskS+klwhMksuf SzB54r5064/Pm9t86s0KLMA1Sb+/okwC2Dl1ZktKVTlZiQpAuYnG+0QHra24RZIq9yKgVPrOS8z 
/Xa8FsmKsTkFh5/Y2IFP1poSpokvJHTV8C6NYId1Z1uIQ1tjkqhTsqwILj9SS3lS+XM3gJoMiRc YBg22k= X-Received: by 2002:a05:7022:f91:b0:129:1d25:f1da with SMTP id a92af1059eb24-13633a69b63mr4019288c88.3.1779529431282; Sat, 23 May 2026 02:43:51 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:50 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny , Jonathan Cameron , Fan Ni , Li Ming Subject: [PATCH v10 08/31] cxl/events: Split event msgnum configuration from irq setup Date: Sat, 23 May 2026 02:43:02 -0700 Message-ID: <2906584012fc147ecf67578022a78415f60f73ce.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny Dynamic Capacity Devices (DCD) require event interrupts to process memory addition or removal. BIOS may have control over non-DCD event processing. DCD interrupt configuration needs to be separate from memory event interrupt configuration. Split cxl_event_config_msgnums() from irq setup in preparation for separate DCD interrupts configuration. 
Reviewed-by: Jonathan Cameron Reviewed-by: Fan Ni Reviewed-by: Dave Jiang Reviewed-by: Li Ming Signed-off-by: Ira Weiny --- Changes: [anisa: rebase] --- drivers/cxl/pci.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 60f9fa05d9ef..35942b2ace53 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -599,35 +599,31 @@ static int cxl_event_config_msgnums(struct cxl_memdev= _state *mds, return cxl_event_get_int_policy(mds, policy); } =20 -static int cxl_event_irqsetup(struct cxl_memdev_state *mds) +static int cxl_event_irqsetup(struct cxl_memdev_state *mds, + struct cxl_event_interrupt_policy *policy) { struct cxl_dev_state *cxlds =3D &mds->cxlds; - struct cxl_event_interrupt_policy policy; int rc; =20 - rc =3D cxl_event_config_msgnums(mds, &policy); - if (rc) - return rc; - - rc =3D cxl_event_req_irq(cxlds, policy.info_settings); + rc =3D cxl_event_req_irq(cxlds, policy->info_settings); if (rc) { dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n"); return rc; } =20 - rc =3D cxl_event_req_irq(cxlds, policy.warn_settings); + rc =3D cxl_event_req_irq(cxlds, policy->warn_settings); if (rc) { dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n"); return rc; } =20 - rc =3D cxl_event_req_irq(cxlds, policy.failure_settings); + rc =3D cxl_event_req_irq(cxlds, policy->failure_settings); if (rc) { dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n"); return rc; } =20 - rc =3D cxl_event_req_irq(cxlds, policy.fatal_settings); + rc =3D cxl_event_req_irq(cxlds, policy->fatal_settings); if (rc) { dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n"); return rc; @@ -674,11 +670,15 @@ static int cxl_event_config(struct pci_host_bridge *h= ost_bridge, return -EBUSY; } =20 + rc =3D cxl_event_config_msgnums(mds, &policy); + if (rc) + return rc; + rc =3D cxl_mem_alloc_event_buf(mds); if (rc) return rc; =20 - rc =3D 
cxl_event_irqsetup(mds); + rc =3D cxl_event_irqsetup(mds, &policy); if (rc) return rc; =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dy1-f169.google.com (mail-dy1-f169.google.com [74.125.82.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3954B386C15 for ; Sat, 23 May 2026 09:43:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529436; cv=none; b=nW3wQwbMt7ArC9tuPTL4LXjLHx5senNX7Z1n8mdpe/EVEj1PIPplOhZ84yqnboDnraLJynbE7aqwbPQpsfBXs0BtFFmbkUrsAaoMVZD4OWWWvGww+zx6TaNQRSFszosQHM700Wp9oVsEL+/P3unCFZ5Pn1/XwE1S+hbNzKKPI9w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529436; c=relaxed/simple; bh=cA6OvtiMYiH8fWm6hs/pB2NqRBhSnJLhmhXOqE10cNA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rfUsQfYNx1A/F8a+qH6109BNU22QG8ZTVMtzI1vzgnXQRwVAODZ0cH2NwVEzTExG4YxjBlXyeVJOGv5meREYmy+NVmTr1kbY8PASJst4WgYqDk9pOddCbSfqNOEhndrBtVwMY9d7ksnQFohYeJxEfnWaUK9m0rpeoita1G89Xrk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=qpi7iN2b; arc=none smtp.client-ip=74.125.82.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="qpi7iN2b" Received: by mail-dy1-f169.google.com with SMTP id 5a478bee46e88-2ef2a1cc06dso12795989eec.0 for ; Sat, 23 May 2026 02:43:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; 
s=20251104; t=1779529433; x=1780134233; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=veSAHzzDLBJHWUT6ObwVQlgXBETLjKgsofHDN9ZwSyE=; b=qpi7iN2bE/i6DxBZNgs/At8dl3Zlt7X3cG6DGwIuObIOOkB/3ylh3b4H9/mYgjGlwZ PB84zWS1Ks3dR4epH0IyrWIPBgc15oDctMAisY1VEBOMJTM05+cvhOwGbKKUB2yeWV8E bKJt/TxZaNvaxzUn7qoWia6Nj+SC3OucyDs+QPw//R+4EGNRkm2zMIXZyZGq/en8Tfk8 /ThPXghF2P6nMNiIc16oeQwhDFxOJxQGnE4mGvTxUKkqqs5KPmY/R46tFdm+EErU1ZSG CQjSM4Pgduwnn7uPwYOck/EhGWXuejDdJ4XzYpvUoktDlFBIUTKUFMFNda4QxcgR/y4M 3ixw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529433; x=1780134233; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=veSAHzzDLBJHWUT6ObwVQlgXBETLjKgsofHDN9ZwSyE=; b=Fdiy01upQv6FPtcV6puEesU45hF4vJj+z0fmyYAbOetoDsaaZHS3/mCDwBQ36ZY1ZA 9GjrKB35PGPr2DtNfcJUfRC/FDFOpKispTJ0FlvMaOy7A+dequgGmsm6dCYnlNNPjlFb H4QhJIt5dMGYEkKyPW4Tr8sf1FY3mXNZgOUDjr8MoGQhO4n3kCWMIPkFjO/Cu8PVjGok EGNBms0ReSwaubmZLwlE94JhSKBMCJDRoUaPGg4CnZIXM3+xCqpknNTbR6Bt8qXiwSmI YHkeQ0vf8kilnuDmtq65CpX6gTkpkVdbL0MknHz9j9jxwc/JSXRG4x0CX8trOnJXFf0i zF+A== X-Forwarded-Encrypted: i=1; AFNElJ+7FUqy3TJWAQr/Ln0uNLRVuuuaigIVjcV+BT1K8ek4oSlWfw94JJBEmhDWfLJeEU/+rqE+WTKZRYTMtaY=@vger.kernel.org X-Gm-Message-State: AOJu0YxoJN3YAZiEvOarof65T4ZaVcWL767cSUJ4E+pK6xLeV+wljktp fI0ve9gVFbzkprsTIMM7HomwHcbiQtZxfsWkNOUgRdMh36E+AAX2ylbM X-Gm-Gg: Acq92OGMohJJVoHFbI5HGQg12ZMZ5nOnhKdkQK/bYhxnuFLJePFgqEBx7Thub5n/acd K82MRQIQBiOpXHvL7SRHaGhTVDDTu0L3Q0ry1usxGFuwRs1lSXTXSVy3sz6TLpMwtvLJgymO6eH 6cTVJlYySirR2v8ImdHiupvGk5xkUz1qESNU0ykXpaaWMvL/FNgRmqaekmBkOYR/+8xzaAX+7hJ XqwLjW47mnvPQALh6hneqpP2Cxk8MYDZtRvszv/eUs1lp+om1EDUo5XyAlV7HJ279OGSvwfcBHn To41iNPTfEVLgMIoO3ipvIMT4qqlCcxOmnIlH944Gh6bar3wdcodJvDePbyHuM+n4yNH7h3URWc 
TSRbfoqyjpKQne/ZHrYte/hEYu811Vh0lc6qkPwEgqgiZosmN1Aa/L1g0Y1VWo1F8acq7Xlg5aa ms4njVqxux47xH1qBM2TNrTmaPqx1apvKnaVlLu3+rHu6vT/LChh0LCBc5rJS1pbi0sriguAbSv l+wByViaftSp3fMNg== X-Received: by 2002:a05:7022:1e11:b0:130:6c8f:5a87 with SMTP id a92af1059eb24-1365f81e50amr3002026c88.13.1779529433221; Sat, 23 May 2026 02:43:53 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:52 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny , Jonathan Cameron , Fan Ni , Li Ming , Dan Williams Subject: [PATCH v10 09/31] cxl/pci: Factor out interrupt policy check Date: Sat, 23 May 2026 02:43:03 -0700 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny Dynamic Capacity Devices (DCD) require event interrupts to process memory addition or removal. BIOS may have control over non-DCD event processing. DCD interrupt configuration needs to be separate from memory event interrupt configuration. Factor out event interrupt setting validation. 
Reviewed-by: Dave Jiang Reviewed-by: Jonathan Cameron Reviewed-by: Fan Ni Reviewed-by: Li Ming Link: https://lore.kernel.org/all/663922b475e50_d54d72945b@dwillia2-xfh.jf.intel.com.notmuch/ [1] Suggested-by: Dan Williams Signed-off-by: Ira Weiny --- Changes: [anisa: rebase] --- drivers/cxl/pci.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 35942b2ace53..8d12c684d670 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -639,6 +639,21 @@ static bool cxl_event_int_is_fw(u8 setting) return mode =3D=3D CXL_INT_FW; } =20 +static bool cxl_event_validate_mem_policy(struct cxl_memdev_state *mds, + struct cxl_event_interrupt_policy *policy) +{ + if (cxl_event_int_is_fw(policy->info_settings) || + cxl_event_int_is_fw(policy->warn_settings) || + cxl_event_int_is_fw(policy->failure_settings) || + cxl_event_int_is_fw(policy->fatal_settings)) { + dev_err(mds->cxlds.dev, + "FW still in control of Event Logs despite _OSC settings\n"); + return false; + } + + return true; +} + static int cxl_event_config(struct pci_host_bridge *host_bridge, struct cxl_memdev_state *mds, bool irq_avail) { @@ -661,14 +676,8 @@ static int cxl_event_config(struct pci_host_bridge *host_bridge, if (rc) return rc; =20 - if (cxl_event_int_is_fw(policy.info_settings) || - cxl_event_int_is_fw(policy.warn_settings) || - cxl_event_int_is_fw(policy.failure_settings) || - cxl_event_int_is_fw(policy.fatal_settings)) { - dev_err(mds->cxlds.dev, - "FW still in control of Event Logs despite _OSC settings\n"); + if (!cxl_event_validate_mem_policy(mds, &policy)) return -EBUSY; - } =20 rc =3D cxl_event_config_msgnums(mds, &policy); if (rc) --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f52.google.com (mail-dl1-f52.google.com [74.125.82.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id
82D42386C31 for ; Sat, 23 May 2026 09:43:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529437; cv=none; b=s6NI0nvVITcI38JhPXRhcNiHrQGy5Ub8cbjjbNHd0a94BUtqZLwvcyAjjqNFWkkB2p8kCqTowLiF0wiKFhMvXz1ky0GjEGYVp2emus3a/ayF9E1AtaIWVGQEZ5x/52FjTMOKy8/1iugR7291iDv6z65J5sthZ7gI4BaFyKmRrsA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529437; c=relaxed/simple; bh=WdPnNZdBMfPNVkxZQcUkvrepHszswdE8ldfahIBUbPc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nToZeDnrIDGIJsi2/1t96zM+vaGmgzcw4V1vbaX5fxsbu5Du+61QG160qswn8F2yBkFHdhRgbhiRDq0AwGp9pu0i1q7mjHR8TTw8jmyd4Ue5YpD2R+aKCUSMw16D4GjD4WExUOibDGNnKZu2y/mo6ZXaO7xRLceSA2HBClrEzV0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=IaUZzPlC; arc=none smtp.client-ip=74.125.82.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IaUZzPlC" Received: by mail-dl1-f52.google.com with SMTP id a92af1059eb24-132830d8281so2577270c88.1 for ; Sat, 23 May 2026 02:43:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529435; x=1780134235; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=drGooomvIKHMHD9aQDfS7/igAqAzAvEceRuTEAEvyv8=; b=IaUZzPlC3yvqSx7EJcmUaz6UY621MICdFUzcDAscnW1BfYKhbnF7VIcZI0bAzQZRLQ GEDhhFcuapnFfThVxkh4uo25TlAkaPGMlJAA6Qfk9z44kRHiUea3aZynCTrIGWKwqtN1 
iGZPKJGVlMGUGqg3lHejEf34XUOzIuDQJm8p1Az2ama5SMReKDYgeaomldsSP1syjTbI yhUY2UTJj4RDhaFWb+PRNMzoBpgsdUR6Yasd9D2/VFrKKv/DXRDccq0QDr4NQ2MzIdh7 /iv9ncHUPqrvmw/mS9TO/evtDrCH7N7xB8o/vzY4Ur7whvEUJ2qFgLLtV1UF+ubMXKyy zXJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529435; x=1780134235; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=drGooomvIKHMHD9aQDfS7/igAqAzAvEceRuTEAEvyv8=; b=LbEaSWSqLyJM2z/zKzIu0m5f2Vwaij6h8jGRARebCYCOrCVQpZmWe4C8eWIRDu3bls tGJa64GgrSyvlvIuUJy+hx4iUaBHf0IAF26nkvUdCr5x9ATPblW/B6I3pBn+QWGUwM09 pAYZA3YmCnz1E2vR0IlC8UuaBr5uHFZZ2NhVSQscG4sKpw38G/e2xNLIYVlK840q+TMy 3QwiDY/6zaAIpV5+PXz7D5yTxqNVFeNIhZ5vxd2hYx0pSWKBwgFBcNgzkBWz+v/iBdnZ qMsbqYNw8SZklU8iGXtmv4aQDDxdW+n5w1ykDE1Hvw+oy1xzGz+fLZArMm+hDjQ88H4s O2JQ== X-Forwarded-Encrypted: i=1; AFNElJ8iz1E/5Kq5zsDr0PN0krAshroYPFoNpv9VgoBy4tGawZrZKpFpdUg0uWLh4VlL4W5s7QZk65BDKzkJ0B4=@vger.kernel.org X-Gm-Message-State: AOJu0YwL4f6ymx1A0c6goiDCKmU2hKrVW9aladeNfuwf/griUjZ7/UaA 7/d8QEFGV8/tv8VVjpXijuzaxOnVB6em3XBg/SUAHSV0xbszwHtm6Ya0 X-Gm-Gg: Acq92OHHAEjM2c2RgL2VNDS2iBHvHxkQOeHbhKO4TIs4iU/rpnuuCarzenHhb7vVjX3 vGFWHVfb8vee8yq0gzwFDPrcUSkf02fr6eUqyOukuMvuyxiydhSF/7JvedmW7NyPlpCsvV9meh8 CgWeAQFHw7XduA3OaTCVnzPqHsF8JJy0ag5LxdAFOw/ybkcBy5jQZUuJMGEy1aQ2Vz4K6+GeA+4 rTPqvNCURDzH37D0+4h1zWG/Vm5QK5wHCJ1s5JdgL7zVXbvJ8HaSOa6rUPPvovRQ0Yt4d94f6Ae YADo4SwyX32nZcK0V7STA+T47c2BIT4mQrEEgtLzy/HWb1kBT/pc2ZDVpJQEE3xQfkDKPvd+LbH iGZfURD6HWCoVLLgkWQGtw5HGLWQHa1da0jeNiEghR9aV64EPiIpmenocmJIB0QT5Bh5v6OjHqm EWo0XO1g+y9Z2SVx0PIHo7nSW4c8m7lKruVBXIpQkoUOegaTcGzMz0xf+NaBr3Ei+QNCtyZmOXf zAk8OM= X-Received: by 2002:a05:7022:511:b0:134:d708:1a24 with SMTP id a92af1059eb24-1365f8161bemr2409921c88.17.1779529434676; Sat, 23 May 2026 02:43:54 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:54 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny , Anisa Su Subject: [PATCH v10 10/31] cxl/mem: Configure dynamic capacity interrupts Date: Sat, 23 May 2026 02:43:04 -0700 Message-ID: <7f2e4fe385415e0b77b58f4bd988bc5895557dcf.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny Dynamic Capacity Devices (DCD) support extent change notifications through the event log mechanism. The interrupt mailbox commands were extended in CXL 3.1 to support these notifications. Firmware can't configure DCD events to be FW controlled but can retain control of memory events. Configure DCD event log interrupts on devices supporting dynamic capacity. Disable DCD if interrupts are not supported. Care is taken to preserve the interrupt policy set by the FW if FW first has been selected by the BIOS. Accept the 4-byte CXL 2.0 reply on GET Event Interrupt Policy by setting min_out to CXL_EVENT_INT_POLICY_BASE_SIZE; pre-CXL 3.1 firmware omits dcd_settings and would otherwise fail the size check. Based on an original patch by Navneet Singh. 
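The short-reply handling described above can be illustrated with a small standalone sketch. This is not the kernel mailbox code: parse_policy_reply() and POLICY_BASE_SIZE are hypothetical names for a userspace illustration, and the structure only mirrors the five policy bytes named in this patch. It assumes a reply is acceptable when it carries at least the 4-byte base layout, and that an omitted dcd_settings byte should read as 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CXL_EVENT_INT_POLICY_BASE_SIZE */
#define POLICY_BASE_SIZE 4 /* info, warn, failure, fatal */

/* Mirrors the policy layout in this patch; not the kernel definition */
struct event_interrupt_policy {
	uint8_t info_settings;
	uint8_t warn_settings;
	uint8_t failure_settings;
	uint8_t fatal_settings;
	uint8_t dcd_settings;
} __attribute__((packed));

/*
 * Hypothetical helper: copy a device reply of reply_len bytes into *policy.
 * Returns 0 on success, -1 if the reply is shorter than the 4-byte base
 * layout or longer than the structure.
 */
static int parse_policy_reply(const uint8_t *reply, size_t reply_len,
			      struct event_interrupt_policy *policy)
{
	if (reply_len < POLICY_BASE_SIZE || reply_len > sizeof(*policy))
		return -1;

	/* Zero first so a dcd_settings byte the device omitted reads as 0 */
	memset(policy, 0, sizeof(*policy));
	memcpy(policy, reply, reply_len);
	return 0;
}
```

Zeroing the destination before the copy plays the same role as initializing the on-stack policy to { 0 } in cxl_event_config(): a pre-CXL 3.1 reply leaves dcd_settings at a known value instead of stack garbage.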
Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: rebase] [anisa: accept 4-byte CXL 2.0 GET reply via min_out] [anisa: drop Reviewed-by tags now that the patch carries new changes] --- drivers/cxl/cxlmem.h | 2 ++ drivers/cxl/pci.c | 75 ++++++++++++++++++++++++++++++++++++-------- 2 files changed, 64 insertions(+), 13 deletions(-) diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 10175ca3b7ee..65c009b02da6 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -218,7 +218,9 @@ struct cxl_event_interrupt_policy { u8 warn_settings; u8 failure_settings; u8 fatal_settings; + u8 dcd_settings; } __packed; +#define CXL_EVENT_INT_POLICY_BASE_SIZE 4 /* info, warn, failure, fatal */ =20 /** * struct cxl_event_state - Event log driver state diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 8d12c684d670..83617439bbd3 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -557,6 +557,8 @@ static int cxl_event_get_int_policy(struct cxl_memdev_s= tate *mds, .opcode =3D CXL_MBOX_OP_GET_EVT_INT_POLICY, .payload_out =3D policy, .size_out =3D sizeof(*policy), + /* CXL 2.0 firmware omits dcd_settings; accept the shorter reply */ + .min_out =3D CXL_EVENT_INT_POLICY_BASE_SIZE, }; int rc; =20 @@ -569,23 +571,34 @@ static int cxl_event_get_int_policy(struct cxl_memdev= _state *mds, } =20 static int cxl_event_config_msgnums(struct cxl_memdev_state *mds, - struct cxl_event_interrupt_policy *policy) + struct cxl_event_interrupt_policy *policy, + bool native_cxl) { struct cxl_mailbox *cxl_mbox =3D &mds->cxlds.cxl_mbox; + size_t size_in =3D CXL_EVENT_INT_POLICY_BASE_SIZE; struct cxl_mbox_cmd mbox_cmd; int rc; =20 - *policy =3D (struct cxl_event_interrupt_policy) { - .info_settings =3D CXL_INT_MSI_MSIX, - .warn_settings =3D CXL_INT_MSI_MSIX, - .failure_settings =3D CXL_INT_MSI_MSIX, - .fatal_settings =3D CXL_INT_MSI_MSIX, - }; + /* memory event policy is left if FW has control */ + if (native_cxl) { + *policy =3D (struct 
cxl_event_interrupt_policy) { + .info_settings =3D CXL_INT_MSI_MSIX, + .warn_settings =3D CXL_INT_MSI_MSIX, + .failure_settings =3D CXL_INT_MSI_MSIX, + .fatal_settings =3D CXL_INT_MSI_MSIX, + .dcd_settings =3D 0, + }; + } + + if (cxl_dcd_supported(mds)) { + policy->dcd_settings =3D CXL_INT_MSI_MSIX; + size_in +=3D sizeof(policy->dcd_settings); + } =20 mbox_cmd =3D (struct cxl_mbox_cmd) { .opcode =3D CXL_MBOX_OP_SET_EVT_INT_POLICY, .payload_in =3D policy, - .size_in =3D sizeof(*policy), + .size_in =3D size_in, }; =20 rc =3D cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); @@ -632,6 +645,30 @@ static int cxl_event_irqsetup(struct cxl_memdev_state = *mds, return 0; } =20 +static int cxl_irqsetup(struct cxl_memdev_state *mds, + struct cxl_event_interrupt_policy *policy, + bool native_cxl) +{ + struct cxl_dev_state *cxlds =3D &mds->cxlds; + int rc; + + if (native_cxl) { + rc =3D cxl_event_irqsetup(mds, policy); + if (rc) + return rc; + } + + if (cxl_dcd_supported(mds)) { + rc =3D cxl_event_req_irq(cxlds, policy->dcd_settings); + if (rc) { + dev_err(cxlds->dev, "Failed to get interrupt for DCD event log\n"); + cxl_disable_dcd(mds); + } + } + + return 0; +} + static bool cxl_event_int_is_fw(u8 setting) { u8 mode =3D FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting); @@ -657,18 +694,26 @@ static bool cxl_event_validate_mem_policy(struct cxl_= memdev_state *mds, static int cxl_event_config(struct pci_host_bridge *host_bridge, struct cxl_memdev_state *mds, bool irq_avail) { - struct cxl_event_interrupt_policy policy; + struct cxl_event_interrupt_policy policy =3D { 0 }; + bool native_cxl =3D host_bridge->native_cxl_error; int rc; =20 /* * When BIOS maintains CXL error reporting control, it will process * event records. Only one agent can do so. + * + * If BIOS has control of events and DCD is not supported skip event + * configuration. 
*/ - if (!host_bridge->native_cxl_error) + if (!native_cxl && !cxl_dcd_supported(mds)) return 0; =20 if (!irq_avail) { dev_info(mds->cxlds.dev, "No interrupt support, disable event processing= .\n"); + if (cxl_dcd_supported(mds)) { + dev_info(mds->cxlds.dev, "DCD requires interrupts, disable DCD\n"); + cxl_disable_dcd(mds); + } return 0; } =20 @@ -676,10 +721,10 @@ static int cxl_event_config(struct pci_host_bridge *h= ost_bridge, if (rc) return rc; =20 - if (!cxl_event_validate_mem_policy(mds, &policy)) + if (native_cxl && !cxl_event_validate_mem_policy(mds, &policy)) return -EBUSY; =20 - rc =3D cxl_event_config_msgnums(mds, &policy); + rc =3D cxl_event_config_msgnums(mds, &policy, native_cxl); if (rc) return rc; =20 @@ -687,12 +732,16 @@ static int cxl_event_config(struct pci_host_bridge *h= ost_bridge, if (rc) return rc; =20 - rc =3D cxl_event_irqsetup(mds, &policy); + rc =3D cxl_irqsetup(mds, &policy, native_cxl); if (rc) return rc; =20 cxl_mem_get_event_records(mds, CXLDEV_EVENT_STATUS_ALL); =20 + dev_dbg(mds->cxlds.dev, "Event config : %s DCD %s\n", + native_cxl ? "OS" : "BIOS", + cxl_dcd_supported(mds) ? 
"supported" : "not supported"); + return 0; } =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f53.google.com (mail-dl1-f53.google.com [74.125.82.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 10BE23876BE for ; Sat, 23 May 2026 09:43:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529445; cv=none; b=nHtjewJ8srLhdnw3w/L4qVZTAHPDU9IQ2Vm6t/MztbaQao/pmKYM2BlPaCpanjkD0a70HMaUnAQJVO8a/80e9PBeuUh509AHMShkuykDl6hhzkeWur1kUxqLdj5jLNRyqYL8fTAzm3sviGKWoDCd/G5ws0OJaTY8kA45d3kr/po= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529445; c=relaxed/simple; bh=z38R1UrW2A3fP0de42veAZZx0/POxEvJRZoYfeRXlVg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YPp/G9sudOH1UDfuzsoGQHlinAmtWtfl5pLryay2KNvlM2OGtdXfJZ5j1EDyHwEi12dea3o9Gb24cMdGxCY8Tm7tkzu3OtRf1BCEeTJJao2P0zrfo9TZVdD8uUD4QQZFAPhTDRbBhQ+XD35RU4qYzcCnadxMg6QQ7zlkrFmpp14= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Q+DnLk1x; arc=none smtp.client-ip=74.125.82.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Q+DnLk1x" Received: by mail-dl1-f53.google.com with SMTP id a92af1059eb24-12ddbe104ccso6054171c88.0 for ; Sat, 23 May 2026 02:43:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529437; x=1780134237; 
darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+KYd36Ajui1jTWwp0Hnf0okuHH/+vG5b+1VP/sgdrc0=; b=Q+DnLk1xVSVAQrUJ+B5Mxr+hmEn/cH55n6+gChRe4jnCAz0O1s6u4ieR0En94QhZUX vMwebahvBb290x8N+X3fE20XDqnrDt5OzMWCvKHvW9Y6rSibR+UNQeMR64f1ElJLMUFv DPZUIaKdoUWtyRrVMP6KruLgZYnRkS6EDwNd0+D2gRCZevwbeZ8lJgP5YuqBsvgzneC2 +vITouNJgwOqdT6rataZpyfA/mnoUOIV0ei9QzEEHBA8XLgnIsgwY3LvqR8Bh+ULfXZD nHYRZ3/UwrHLxYOXrVjG22uuXGik5yHLvRtTsux+9DepJ5QVP1w2RT+NpadzzNDkFCvs Fxrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529437; x=1780134237; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=+KYd36Ajui1jTWwp0Hnf0okuHH/+vG5b+1VP/sgdrc0=; b=BBCwIU6ovjFHPq9gYMBTyhUwfElMm+s2BcNJlHVItd9H9VIuTqPMqQMJyJLBFpxzOL o9z3G+d1sfHUMaNLmWAwOHWkHR2DUBL125U5E0tLB99z/2WLG6R3E+GS1+EQfybIo+oj C5YM+Ot9LXvFeCb0SRXaJKARRu+YBOCeMWdbZj/ZboUxH90uNZXc0hrYb0kd3W7GK8sJ 6fbY7HMVmD6tVrNgSR2e5rd9kf/w5Yd2q+WjeKzei6Q2FUpghrWZmgrmiQKsgtRtSjoD BAyqo30iziA6UseWf37QD7OSg4mo0HOPR8UyGb9E8lrS7Y7geJVoOcHNEu0mV1B6to2k ZbQQ== X-Forwarded-Encrypted: i=1; AFNElJ80ddeAVrImFpTL9d8ftLcR8Qgke6LjILdOD9OOpDED2BHBmomSwYMVw1Cwm5PLQEbL3Jgn9/hrDZNYBps=@vger.kernel.org X-Gm-Message-State: AOJu0YzKcvoWwxlTeVle+I9uAGBITTltTn1mmgVNx1mK2TYwa1ngD1BP kllpCBmAbK9AHqOM294w3d32lJ+nu20Uxoih/aZmj64EMTdCd2GhHUAI X-Gm-Gg: Acq92OEXjUshqtN+RHhDGe92fN4aHTp3MW4/jhh+lukorBWZ/sHXed1JpW+rqR1yZ+x INmuh+3i0MrfWgq/taxXCF301QBAk1jrUfU3jOO+E5ONGNLppqEzlFkCRHXHa2t2fqZpZs/PGAE 0jnH7Vo6IMSyTyM/r08sclOqk4XHw7wMCR27sEv12M3+hz3V5YaA5/S0KQmRTxBw6mD3nSLihQV 7olQp9NuwCqIg8WBWzCn5flFXha3+2Tj8Nn5D+WqO6CpaWV3Ztja0lTOgFcMmtIXJHTqgNNuFTr 4lUD3UkJjr74k6Vf0v8tXuWAtH8Gp08N0QBSeDfc4x3lVWauwKbEwrDTeBssoqj+Zx3c+HiQd5k 1I+1v0ocRhEmaZfv1JxD1UqiM09XtRrSHPDzSPB72nYydVUR1SY/ZoD75fDD1LLkjRlryeX/2O6 
pARpxA7zjezzl96oRUkpPRaLKl7y4hLkMaTJMnihB9U29T6Azu7jCmQv1Yld89craf/B7eAz5iK 3IM5ZtmUlxfrrbl6A== X-Received: by 2002:a05:7022:41aa:b0:12c:2cf8:2f30 with SMTP id a92af1059eb24-1365f811393mr2491801c88.15.1779529437160; Sat, 23 May 2026 02:43:57 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:56 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny , Jonathan Cameron , Fan Ni , Li Ming Subject: [PATCH v10 11/31] cxl/core: Return endpoint decoder information from region search Date: Sat, 23 May 2026 02:43:05 -0700 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ira Weiny cxl_dpa_to_region() finds the region from a tuple. The search involves finding the device endpoint decoder as well. Dynamic capacity extent processing uses the endpoint decoder HPA information to calculate the HPA offset. In addition, well behaved extents should be contained within an endpoint decoder. Return the endpoint decoder found to be used in subsequent DCD code. 
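The calling convention this introduces is an optional out-parameter: callers that only want the region pass NULL, callers that also need the decoder pass a pointer to receive it. A minimal standalone sketch of that pattern follows. struct decoder and dpa_to_region() are hypothetical simplifications for illustration, not the CXL core types, and the range check is an assumption about what "covers the DPA" means here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an endpoint decoder */
struct decoder {
	uint64_t dpa_start;
	uint64_t dpa_len;
	int region_id; /* -1 when the decoder is not attached to a region */
};

/*
 * Return the region id whose decoder covers dpa, or -1 if none does.
 * When found is non-NULL it receives the matching decoder, or NULL if
 * the search failed, so the caller never reads a stale pointer.
 */
static int dpa_to_region(const struct decoder *decoders, size_t n,
			 uint64_t dpa, const struct decoder **found)
{
	const struct decoder *match = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct decoder *d = &decoders[i];

		if (d->region_id >= 0 && dpa >= d->dpa_start &&
		    dpa - d->dpa_start < d->dpa_len) {
			match = d;
			break;
		}
	}

	/* Existing callers pass NULL and are unaffected */
	if (found)
		*found = match;

	return match ? match->region_id : -1;
}
```

Writing the out-parameter on both the hit and the miss path keeps the two return values consistent: a caller that gets no region also gets a NULL decoder.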
Reviewed-by: Jonathan Cameron Reviewed-by: Fan Ni Reviewed-by: Dave Jiang Reviewed-by: Li Ming Reviewed-by: Alison Schofield Signed-off-by: Ira Weiny --- Changes: [anisa: rebase] --- drivers/cxl/core/core.h | 6 ++++-- drivers/cxl/core/mbox.c | 2 +- drivers/cxl/core/memdev.c | 4 ++-- drivers/cxl/core/region.c | 8 +++++++- 4 files changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 8881cc9323e0..14723cfd05f0 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -58,7 +58,8 @@ int cxl_decoder_detach(struct cxl_region *cxlr, int cxl_region_init(void); void cxl_region_exit(void); int cxl_get_poison_by_endpoint(struct cxl_port *port); -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa); +struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa, + struct cxl_endpoint_decoder **cxled); u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, u64 dpa); int devm_cxl_add_dax_region(struct cxl_region *cxlr); @@ -71,7 +72,8 @@ static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, return ULLONG_MAX; } static inline -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa) +struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa, + struct cxl_endpoint_decoder **cxled) { return NULL; } diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index f9a5e21f5d09..01b1a318f34f 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -968,7 +968,7 @@ void cxl_event_trace_record(struct cxl_memdev *cxlmd, guard(rwsem_read)(&cxl_rwsem.dpa); =20 dpa =3D le64_to_cpu(evt->media_hdr.phys_addr) & CXL_DPA_MASK; - cxlr =3D cxl_dpa_to_region(cxlmd, dpa); + cxlr =3D cxl_dpa_to_region(cxlmd, dpa, NULL); if (cxlr) { u64 cache_size =3D cxlr->params.cache_size; =20 diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c index 064cfd628577..b8b3489f69e5 100644 --- 
a/drivers/cxl/core/memdev.c +++ b/drivers/cxl/core/memdev.c @@ -320,7 +320,7 @@ int cxl_inject_poison_locked(struct cxl_memdev *cxlmd, = u64 dpa) if (rc) return rc; =20 - cxlr =3D cxl_dpa_to_region(cxlmd, dpa); + cxlr =3D cxl_dpa_to_region(cxlmd, dpa, NULL); if (cxlr) dev_warn_once(cxl_mbox->host, "poison inject dpa:%#llx region: %s\n", dpa, @@ -389,7 +389,7 @@ int cxl_clear_poison_locked(struct cxl_memdev *cxlmd, u= 64 dpa) if (rc) return rc; =20 - cxlr =3D cxl_dpa_to_region(cxlmd, dpa); + cxlr =3D cxl_dpa_to_region(cxlmd, dpa, NULL); if (cxlr) dev_warn_once(cxl_mbox->host, "poison clear dpa:%#llx region: %s\n", dpa, diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 7561bf3d8af8..733d77c07493 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -2991,6 +2991,7 @@ int cxl_get_poison_by_endpoint(struct cxl_port *port) struct cxl_dpa_to_region_context { struct cxl_region *cxlr; u64 dpa; + struct cxl_endpoint_decoder *cxled; }; =20 static int __cxl_dpa_to_region(struct device *dev, void *arg) @@ -3024,11 +3025,13 @@ static int __cxl_dpa_to_region(struct device *dev, = void *arg) dev_name(dev)); =20 ctx->cxlr =3D cxlr; + ctx->cxled =3D cxled; =20 return 1; } =20 -struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa) +struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa, + struct cxl_endpoint_decoder **cxled) { struct cxl_dpa_to_region_context ctx; struct cxl_port *port =3D cxlmd->endpoint; @@ -3042,6 +3045,9 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl= _memdev *cxlmd, u64 dpa) if (cxl_num_decoders_committed(port)) device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region); =20 + if (cxled) + *cxled =3D ctx.cxled; + return ctx.cxlr; } =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f46.google.com (mail-dl1-f46.google.com [74.125.82.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0888E3859D9 for ; Sat, 23 May 2026 09:44:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529444; cv=none; b=bPHXOwKpLxyzWoeqb7035wREGJv0jTQDNo5NP28ybGroGtmq3ZPOtpp+zzl2sZcILwtLq3Qgmg+Y5U19/njmoBugfntf2OIcBjTb+yDcpoYLw/XvOTO6FcMIydrgXZFm/8lmy52ruN/VyBnuUb/fur3XLn4mxAXiC1BvELj/YAM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529444; c=relaxed/simple; bh=ddCesjgMJiZ0L05KYHUHodUsMxYo3WI3I9Caw7IL1k8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=QXH8Mx5cu12gtGJFSGVq+IcTi0HAh/wXtHIhTItev7Jraq5/O0uzw8FP/YUqhPJlWo5iOrs0P2BS89oh3kTc8eqhZPpjwXcDV/SmbTm0MGg98kow97gejhshDyHZg0UMVHCF4/W8VQw76Ov0+xyHaID4POBAFTkQ8pMIlNLo1R8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=celIF73F; arc=none smtp.client-ip=74.125.82.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="celIF73F" Received: by mail-dl1-f46.google.com with SMTP id a92af1059eb24-13663f68983so1944223c88.0 for ; Sat, 23 May 2026 02:44:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529441; x=1780134241; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bt/WWLiVqvvMQTuiQC0nRMt3xEqd7NiaZ1OPEgjDEDo=; 
b=celIF73Fh4eiq0uXtoi1SBGqTkAXH82O6WLwwtQAU/SkC+ad74KvD5Zzhf/eaIKb6c g28S09Ie8OJEBD1MPdw8CPleMeqlhGKjP+449nr6Zxh8APO8FmSCcmOxabZARzHl09YJ UeZNtMJnwmTIDMraF9ZH+TpJa5WqXRTZulzXfIa4i4Vu7PW4ydVVmcpeI8XPaDfbOFF8 GXh+rOPs3/pDzlBCX+LVxdRy0Uk/0NaKRk+vTjKsQg5723u6NlxBovanwyA97o7f1oCe MI1IHySu1DFtCGDQmL9drksx5eK9bBVc9Q2lKoIz1c7ZM6ZcEc5u7hbtomaLO0nBs73B +urw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529441; x=1780134241; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=bt/WWLiVqvvMQTuiQC0nRMt3xEqd7NiaZ1OPEgjDEDo=; b=E0tcipJuaUEaoI3NcqNxJHwsX2h8F6N8eWKl5VyrWYCEjVlquCJXvpthxxMqbwBYBG Rgnk8gQgTnYLlQttX5hTRqExLcyZyL4goCDDSxlXU+reMx/8Ukejm9Hddkl1YuE21Che lRMiDuhl3qqv28sV+OEfYwkGbvYMOLHMmHXdbq84sVbDVqFlSkcpNQpk6i5xXoTkkWSS Ck8+maCqtVMZ2HTJMXWjtUWdYRZ4+r1p1MS9/OdaNTsjV2XULlUzPSEC+pJRW6Xp5hiQ /S2Zgf79A9ocOwdFl7H+74TuEx7xdsIuIOSLXtuPhRSm+7hq39PtsxY1KlgkyvHBocNZ 57cQ== X-Forwarded-Encrypted: i=1; AFNElJ+CXIjGE5o63uiyiVZq8vd9hKd2Ixc+5xzQFIJbefUjB6Ki0X8DlYIRmHfbYxhWwcFGDr7wDF68N/sEv7g=@vger.kernel.org X-Gm-Message-State: AOJu0YxgvJdvswCPH11CJqDHIZ6a8n5aqWCahlAYUKCUClEecwzPLkKo 7/SC1fKYk3cWFqKyYqYgAz3b3p0iGMr6muhC+r8PgtLVT3FcdXZfI/XD X-Gm-Gg: Acq92OEcvj8tHKLocpljgRIpkB0dFBrICSNqqDmnWebuMnrdywFG9OA850qMBOufxPL 9nWBv4dmLBhue5recU3mk4Oe8TStY00100SUYqFrF3f1T6LW4O205KYUey6Ha/5nKpuwWwAhzxz DMiOwKdiWTYiiTZosSOvbdBlh/pU8G1GnTMfmIXOZoFMAEare7Gi2l80i2kKXOQjdxPQyIXZ3/H qOgEU3dOTv1QbZ8oO00PCZXeffBiWKpT1IkAMCINDPyVwRMGbjZ/P9apr8mGkVSRuu4cr58dP9U +Ll8xIsXJ/h9ubd4M1+7lgNOuu/sE4aOi+RHeYUo6Xy/xlldXwGlICA4p3/IPC00Ar6opTQaolM +jQoMZqWnJ7XQG7rms2ESwAl+/eqywJ8yYt3DUjB5z4xMvreE2KmrZUucR+BYt5ZqZRYGsi7HGj cFOYeFy5zJcJbawL668ofxm5k4OC4ap5lcNgg3EfggArNE3tokUw55xSMbQmzMvj4yb25WQ5hyW Mm5/JfB+A7cEFAAGQ== X-Received: by 2002:a05:7022:128a:b0:135:dd7c:7 with SMTP id a92af1059eb24-1365fc6db3dmr2572825c88.38.1779529440747; Sat, 23 May 2026 
02:44:00 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.43.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:43:59 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su , Ira Weiny Subject: [PATCH v10 12/31] cxl/mem: Set up framework for handling DC Events Date: Sat, 23 May 2026 02:43:06 -0700 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Adds the support for receiving DC event records but defers the real add/release logic to subsequent commits. Simply refuse all extents for DC_ADD and ack all DC_RELEASE events for now. Forced release is currently unsupported. In order, this commit adds the following: 1. Learn about DC Event Records and how to respond to them * cxl_mem_get_event_records() learns about the DC Event record. Records of that type are routed to cxl_handle_dcd_event_records(). * cxl_handle_dcd_event_records() switches on event_type: - DCD_ADD_CAPACITY -> handle_add_event() - DCD_RELEASE_CAPACITY -> cxl_rm_extent() - DCD_FORCED_CAPACITY_RELEASE is logged and ignored (FM/device-only). * cxl_send_dc_response() sends the reply mailbox commands ADD_DC_RESPONSE / RELEASE_DC 2. 
Add stubs for DC_ADD and DC_RELEASE logic * handle_add_event() stages incoming extents onto mds->add_ctx.pending_extents and, when More=3D0 closes the chain, replies with an empty ADD_DC_RESPONSE =E2=80=94 refusing all extents fo= r now * cxl_rm_extent() acks the release via memdev_release_extent() so the device's view stays consistent; we can ack all releases because we currently don't accept/use any extents offered. 3. Structural setup for later commits: * struct dc_extent, struct cxl_dc_tag_group, and pending_add_ctx set up the stage for the real DC_ADD path, which will enforce tag/grouping semantics Based on an original patch by Navneet Singh. Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: restructured from the original "Process dynamic partition events" monolith; this commit lands only the wire-level intake and dispatches the add/release logic to stubbed handlers. The handlers are fleshed out in subsequent commits.] --- drivers/cxl/core/mbox.c | 246 +++++++++++++++++++++++++++++++++++++++- drivers/cxl/cxl.h | 73 +++++++++++- drivers/cxl/cxlmem.h | 45 ++++++++ include/cxl/event.h | 38 +++++++ 4 files changed, 400 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 01b1a318f34f..1b38f34538f3 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include @@ -1102,6 +1103,238 @@ static int cxl_clear_event_record(struct cxl_memdev= _state *mds, return rc; } =20 +static int send_one_response(struct cxl_mailbox *cxl_mbox, + struct cxl_mbox_dc_response *response, + int opcode, u32 extent_list_size, u8 flags) +{ + struct cxl_mbox_cmd mbox_cmd =3D (struct cxl_mbox_cmd) { + .opcode =3D opcode, + .size_in =3D struct_size(response, extent_list, extent_list_size), + .payload_in =3D response, + }; + + response->extent_list_size =3D cpu_to_le32(extent_list_size); + response->flags =3D flags; + return 
cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); +} + +static int cxl_send_dc_response(struct cxl_memdev_state *mds, int opcode, + struct list_head *extent_list, int cnt) +{ + struct cxl_mailbox *cxl_mbox =3D &mds->cxlds.cxl_mbox; + struct cxl_mbox_dc_response *p; + struct cxl_extent_list_node *pos, *tmp; + struct cxl_extent *extent; + u32 pl_index; + + size_t pl_size =3D struct_size(p, extent_list, cnt); + u32 max_extents =3D cnt; + + /* May have to use more bit on response. */ + if (pl_size > cxl_mbox->payload_size) { + max_extents =3D (cxl_mbox->payload_size - sizeof(*p)) / + sizeof(struct updated_extent_list); + pl_size =3D struct_size(p, extent_list, max_extents); + } + + struct cxl_mbox_dc_response *response __free(kfree) =3D + kzalloc(pl_size, GFP_KERNEL); + if (!response) + return -ENOMEM; + + if (cnt =3D=3D 0) + return send_one_response(cxl_mbox, response, opcode, 0, 0); + + pl_index =3D 0; + list_for_each_entry_safe(pos, tmp, extent_list, list) { + extent =3D pos->extent; + response->extent_list[pl_index].dpa_start =3D extent->start_dpa; + response->extent_list[pl_index].length =3D extent->length; + pl_index++; + + if (pl_index =3D=3D max_extents) { + u8 flags =3D 0; + int rc; + + if (pl_index < cnt) + flags |=3D CXL_DCD_EVENT_MORE; + rc =3D send_one_response(cxl_mbox, response, opcode, + pl_index, flags); + if (rc) + return rc; + cnt -=3D pl_index; + if (cnt < max_extents) + max_extents =3D cnt; + pl_index =3D 0; + } + } + + if (!pl_index) /* nothing more to do */ + return 0; + return send_one_response(cxl_mbox, response, opcode, pl_index, 0); +} + +static void delete_extent_node(struct cxl_extent_list_node *node) +{ + list_del(&node->list); + kfree(node->extent); + kfree(node); +} + +static void memdev_release_extent(struct cxl_memdev_state *mds, struct ran= ge *range) +{ + struct device *dev =3D mds->cxlds.dev; + struct cxl_extent_list_node *node; + LIST_HEAD(extent_list); + + dev_dbg(dev, "Release response dpa %pra\n", range); + + node =3D 
kzalloc(sizeof(*node), GFP_KERNEL); + if (!node) + return; + + node->extent =3D kzalloc(sizeof(*node->extent), GFP_KERNEL); + if (!node->extent) { + kfree(node); + return; + } + + node->extent->start_dpa =3D cpu_to_le64(range->start); + node->extent->length =3D cpu_to_le64(range_len(range)); + list_add_tail(&node->list, &extent_list); + + if (cxl_send_dc_response(mds, CXL_MBOX_OP_RELEASE_DC, &extent_list, 1)) + dev_dbg(dev, "Failed to release %pra\n", range); + + delete_extent_node(node); +} + +static void clear_pending_extents(void *_mds) +{ + struct cxl_memdev_state *mds =3D _mds; + struct cxl_extent_list_node *pos, *tmp; + + list_for_each_entry_safe(pos, tmp, &mds->add_ctx.pending_extents, list) + delete_extent_node(pos); + mds->add_ctx.group =3D NULL; +} + +static int add_to_pending_list(struct list_head *pending_list, + struct cxl_extent *to_add) +{ + struct cxl_extent_list_node *node; + struct cxl_extent *extent; + + node =3D kzalloc(sizeof(*node), GFP_KERNEL); + if (!node) + return -ENOMEM; + extent =3D kmemdup(to_add, sizeof(*extent), GFP_KERNEL); + if (!extent) + return -ENOMEM; + + node->extent =3D extent; + list_add_tail(&node->list, pending_list); + return 0; +} + +/* + * Stub: stage extents on the pending list and reply with an empty + * ADD_DC_RESPONSE on More=3D0 (refuse all). A later commit replaces + * the no-op tail with the real Add pipeline that surfaces a dax + * device per accepted extent. 
+ */ +static int handle_add_event(struct cxl_memdev_state *mds, + struct cxl_event_dcd *event) +{ + struct device *dev =3D mds->cxlds.dev; + int rc; + + rc =3D add_to_pending_list(&mds->add_ctx.pending_extents, &event->extent); + if (rc) + return rc; + + if (event->flags & CXL_DCD_EVENT_MORE) { + dev_dbg(dev, "more bit set; delay the surfacing of extent\n"); + return 0; + } + + rc =3D cxl_send_dc_response(mds, CXL_MBOX_OP_ADD_DC_RESPONSE, + &mds->add_ctx.pending_extents, 0); + clear_pending_extents(mds); + return rc; +} + +/* + * Stub: ack the release back to the device so it knows we are not + * using the range. A later commit replaces this with the real + * teardown that walks the region's tag group and tears down the + * member dc_extent devices. + */ +static int cxl_rm_extent(struct cxl_memdev_state *mds, + struct cxl_extent *extent) +{ + u64 start_dpa =3D le64_to_cpu(extent->start_dpa); + struct range dpa_range =3D { + .start =3D start_dpa, + .end =3D start_dpa + le64_to_cpu(extent->length) - 1, + }; + + memdev_release_extent(mds, &dpa_range); + return 0; +} + +static char *cxl_dcd_evt_type_str(u8 type) +{ + switch (type) { + case DCD_ADD_CAPACITY: + return "add"; + case DCD_RELEASE_CAPACITY: + return "release"; + case DCD_FORCED_CAPACITY_RELEASE: + return "force release"; + default: + break; + } + + return ""; +} + +static void cxl_handle_dcd_event_records(struct cxl_memdev_state *mds, + struct cxl_event_record_raw *raw_rec) +{ + struct cxl_event_dcd *event =3D &raw_rec->event.dcd; + struct cxl_extent *extent =3D &event->extent; + struct device *dev =3D mds->cxlds.dev; + uuid_t *id =3D &raw_rec->id; + int rc; + + if (!uuid_equal(id, &CXL_EVENT_DC_EVENT_UUID)) + return; + + dev_dbg(dev, "DCD event %s : DPA:%#llx LEN:%#llx\n", + cxl_dcd_evt_type_str(event->event_type), + le64_to_cpu(extent->start_dpa), le64_to_cpu(extent->length)); + + switch (event->event_type) { + case DCD_ADD_CAPACITY: + rc =3D handle_add_event(mds, event); + break; + case 
DCD_RELEASE_CAPACITY: + rc =3D cxl_rm_extent(mds, &event->extent); + break; + case DCD_FORCED_CAPACITY_RELEASE: + dev_err_ratelimited(dev, "Forced release event ignored.\n"); + rc =3D 0; + break; + default: + rc =3D -EINVAL; + break; + } + + if (rc) + dev_err_ratelimited(dev, "dcd event failed: %d\n", rc); +} + static void cxl_mem_get_records_log(struct cxl_memdev_state *mds, enum cxl_event_log_type type) { @@ -1138,9 +1371,13 @@ static void cxl_mem_get_records_log(struct cxl_memde= v_state *mds, if (!nr_rec) break; =20 - for (i =3D 0; i < nr_rec; i++) + for (i =3D 0; i < nr_rec; i++) { __cxl_event_trace_record(cxlmd, type, &payload->records[i]); + if (type =3D=3D CXL_EVENT_TYPE_DCD) + cxl_handle_dcd_event_records(mds, + &payload->records[i]); + } =20 if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW) trace_cxl_overflow(cxlmd, type, payload); @@ -1172,6 +1409,8 @@ void cxl_mem_get_event_records(struct cxl_memdev_stat= e *mds, u32 status) { dev_dbg(mds->cxlds.dev, "Reading event logs: %x\n", status); =20 + if (cxl_dcd_supported(mds) && (status & CXLDEV_EVENT_STATUS_DCD)) + cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_DCD); if (status & CXLDEV_EVENT_STATUS_FATAL) cxl_mem_get_records_log(mds, CXL_EVENT_TYPE_FATAL); if (status & CXLDEV_EVENT_STATUS_FAIL) @@ -1769,6 +2008,11 @@ struct cxl_memdev_state *cxl_memdev_state_create(str= uct device *dev, u64 serial, } =20 mutex_init(&mds->event.log_lock); + INIT_LIST_HEAD(&mds->add_ctx.pending_extents); + + rc =3D devm_add_action_or_reset(dev, clear_pending_extents, mds); + if (rc) + return ERR_PTR(rc); =20 rc =3D devm_cxl_register_mce_notifier(dev, &mds->mce_notifier); if (rc =3D=3D -EOPNOTSUPP) diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index 1297594beaec..5ef2cf4d005b 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -12,6 +12,7 @@ #include #include #include +#include #include =20 extern const struct nvdimm_security_ops *cxl_security_ops; @@ -180,11 +181,13 @@ static inline int ways_to_eiw(unsigned int ways, 
u8 *= eiw) #define CXLDEV_EVENT_STATUS_WARN BIT(1) #define CXLDEV_EVENT_STATUS_FAIL BIT(2) #define CXLDEV_EVENT_STATUS_FATAL BIT(3) +#define CXLDEV_EVENT_STATUS_DCD BIT(4) =20 #define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO | \ CXLDEV_EVENT_STATUS_WARN | \ CXLDEV_EVENT_STATUS_FAIL | \ - CXLDEV_EVENT_STATUS_FATAL) + CXLDEV_EVENT_STATUS_FATAL | \ + CXLDEV_EVENT_STATUS_DCD) =20 /* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */ #define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0) @@ -306,6 +309,41 @@ enum cxl_decoder_state { CXL_DECODER_STATE_AUTO_STAGED, }; =20 +struct cxl_dc_tag_group; + +/** + * struct dc_extent - A single dynamic-capacity extent surfaced to the hos= t. + * + * One per device-stamped extent. Multiple dc_extents that share a tag + * (see &struct cxl_dc_tag_group) form a single logical allocation, but + * each dc_extent has its own HPA range and is the unit that the DAX + * layer sees as a backing dax_resource. + * + * @dev: device representing this extent; child of cxlr_dax->dev. + * @group: containing tag group (allocation); shared across siblings. + * @cxled: endpoint decoder backing the DPA range. + * @dpa_range: DPA range this extent covers within @cxled. + * @hpa_range: HPA range that @dpa_range decodes to, relative to + * cxlr_dax->hpa_range.start. + * @uuid: tag uuid (matches @group->uuid; kept for the release-path log). + * @seq_num: 1..n assembly-order index within the tag group. For extents + * from a sharable partition this equals the device-stamped + * shared_extn_seq (CXL 3.1 Table 8-51). For extents from a + * non-sharable partition the device leaves shared_extn_seq =3D=3D 0 + * and the host assigns @seq_num in event arrival order at + * cxl_add_pending() time. Used by the dax layer to assemble + * ranges in the right order regardless of source. 
+ */ +struct dc_extent { + struct device dev; + struct cxl_dc_tag_group *group; + struct cxl_endpoint_decoder *cxled; + struct range dpa_range; + struct range hpa_range; + uuid_t uuid; + u16 seq_num; +}; + /** * struct cxl_endpoint_decoder - Endpoint / SPA to DPA decoder * @cxld: base cxl_decoder_object @@ -518,12 +556,45 @@ struct cxl_pmem_region { struct cxl_pmem_region_mapping mapping[]; }; =20 +/* See CXL 3.1 8.2.9.2.1.6 */ +enum dc_event { + DCD_ADD_CAPACITY, + DCD_RELEASE_CAPACITY, + DCD_FORCED_CAPACITY_RELEASE, + DCD_REGION_CONFIGURATION_UPDATED, +}; + struct cxl_dax_region { struct device dev; struct cxl_region *cxlr; struct range hpa_range; }; =20 +/** + * struct cxl_dc_tag_group - A tagged dynamic-capacity allocation. + * + * Container for the &struct dc_extent siblings that share a tag. The + * group has no sysfs identity; userspace sees the individual dc_extents + * directly under the parent dax_region device. The group exists to + * keep tag-scoped invariants (atomic add, atomic release, ordered carve + * by seq_num) in one place. + * + * @cxlr_dax: back reference to parent region device. + * @uuid: tag identifying this allocation; same across all member dc_exten= ts. + * @dc_extents: xarray of &struct dc_extent in this group, indexed by the + * dc_extent's @seq_num (1..n, dense). See &struct dc_extent + * for how seq_num is sourced for sharable vs non-sharable + * allocations. + * @nr_extents: live count of dc_extents in the group; the group is freed + * when the last dc_extent device is released. 
+ */ +struct cxl_dc_tag_group { + struct cxl_dax_region *cxlr_dax; + uuid_t uuid; + struct xarray dc_extents; + unsigned int nr_extents; +}; + /** * struct cxl_port - logical collection of upstream port devices and * downstream port devices to construct a CXL memory diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 65c009b02da6..592c8e3b611c 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include "cxl.h" @@ -399,6 +400,23 @@ static inline struct cxl_dev_state *mbox_to_cxlds(stru= ct cxl_mailbox *cxl_mbox) return dev_get_drvdata(cxl_mbox->host); } =20 +/** + * struct pending_add_ctx - Staging state for an in-progress + * DCD_ADD_CAPACITY event chain + * @pending_extents: extents received so far in the chain; flushed when + * the chain closes (More=3D0) + * @group: tag group being assembled from the chain + * + * A DCD_ADD_CAPACITY notification can span multiple event records + * stitched together by the CXL_DCD_EVENT_MORE flag. Records are staged + * here until the device clears More, at which point the staged batch is + * processed and responded to as a single Add_DC_Response. 
+ */ +struct pending_add_ctx { + struct list_head pending_extents; + struct cxl_dc_tag_group *group; +}; + /** * struct cxl_memdev_state - Generic Type-3 Memory Device Class driver data * @@ -417,6 +435,8 @@ static inline struct cxl_dev_state *mbox_to_cxlds(struc= t cxl_mailbox *cxl_mbox) * @active_volatile_bytes: sum of hard + soft volatile * @active_persistent_bytes: sum of hard + soft persistent * @dcd_supported: all DCD commands are supported + * @add_ctx: state for an in-progress DCD_ADD_CAPACITY chain + * (see &struct pending_add_ctx) * @event: event log driver state * @poison: poison driver state info * @security: security driver state info @@ -437,6 +457,7 @@ struct cxl_memdev_state { u64 active_volatile_bytes; u64 active_persistent_bytes; bool dcd_supported; + struct pending_add_ctx add_ctx; =20 struct cxl_event_state event; struct cxl_poison_state poison; @@ -513,6 +534,21 @@ enum cxl_opcode { UUID_INIT(0x5e1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19, \ 0x40, 0x3d, 0x86) =20 +/* + * Add Dynamic Capacity Response + * CXL rev 3.1 section 8.2.9.9.9.3; Table 8-168 & Table 8-169 + */ +struct cxl_mbox_dc_response { + __le32 extent_list_size; + u8 flags; + u8 reserved[3]; + struct updated_extent_list { + __le64 dpa_start; + __le64 length; + u8 reserved[8]; + } __packed extent_list[] __counted_by(extent_list_size); +} __packed; + struct cxl_mbox_get_supported_logs { __le16 entries; u8 rsvd[6]; @@ -583,6 +619,14 @@ struct cxl_mbox_identify { UUID_INIT(0xe71f3a40, 0x2d29, 0x4092, 0x8a, 0x39, 0x4d, 0x1c, 0x96, \ 0x6c, 0x7c, 0x65) =20 +/* + * Dynamic Capacity Event Record + * CXL rev 3.1 section 8.2.9.2.1; Table 8-43 + */ +#define CXL_EVENT_DC_EVENT_UUID = \ + UUID_INIT(0xca95afa7, 0xf183, 0x4018, 0x8c, 0x2f, 0x95, 0x26, 0x8e, \ + 0x10, 0x1a, 0x2a) + /* * Get Event Records output payload * CXL rev 3.0 section 8.2.9.2.2; Table 8-50 @@ -608,6 +652,7 @@ enum cxl_event_log_type { CXL_EVENT_TYPE_WARN, CXL_EVENT_TYPE_FAIL, CXL_EVENT_TYPE_FATAL, + 
CXL_EVENT_TYPE_DCD, CXL_EVENT_TYPE_MAX }; =20 diff --git a/include/cxl/event.h b/include/cxl/event.h index ff97fea718d2..fa3cd895f656 100644 --- a/include/cxl/event.h +++ b/include/cxl/event.h @@ -6,6 +6,7 @@ #include #include #include +#include =20 /* * Common Event Record Format @@ -141,12 +142,49 @@ struct cxl_event_mem_sparing { u8 reserved2[0x25]; } __packed; =20 +/* + * CXL rev 3.1 section 8.2.9.2.1.6; Table 8-51 + */ +struct cxl_extent { + __le64 start_dpa; + __le64 length; + u8 uuid[UUID_SIZE]; + __le16 shared_extn_seq; + u8 reserved[0x6]; +} __packed; + +struct cxl_extent_list_node { + struct cxl_extent *extent; + struct list_head list; + int rid; +}; + +/* + * Dynamic Capacity Event Record + * CXL rev 3.1 section 8.2.9.2.1.6; Table 8-50 + */ +#define CXL_DCD_EVENT_MORE BIT(0) +struct cxl_event_dcd { + struct cxl_event_record_hdr hdr; + u8 event_type; + u8 validity_flags; + __le16 host_id; + u8 partition_index; + u8 flags; + u8 reserved1[0x2]; + struct cxl_extent extent; + u8 reserved2[0x18]; + __le32 num_avail_extents; + __le32 num_avail_tags; +} __packed; + union cxl_event { struct cxl_event_generic generic; struct cxl_event_gen_media gen_media; struct cxl_event_dram dram; struct cxl_event_mem_module mem_module; struct cxl_event_mem_sparing mem_sparing; + struct cxl_event_dcd dcd; /* dram & gen_media event header */ struct cxl_event_media_hdr media_hdr; } __packed; --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f51.google.com (mail-dl1-f51.google.com [74.125.82.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9981E38736B for ; Sat, 23 May 2026 09:44:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529445; cv=none; 
b=auEJ01A8i8hqGjdKXg3urnZpjfSZjB7L/IFdMI8AK0PPm38P+L6xdtfWXRZasbZJlW3fmAYUdz/WmOFMVDZmUTJrwfGDi0h2/snm0+aP/0eVfxcDvCdzcN9sbKdrBc2huo2v1pdb0P4/qc10Vtmgyuj5TXOeQgS/sJr++GaRJAU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529445; c=relaxed/simple; bh=n4DApGBlS1c3NISduAouPH4YcExjkm0CRhua22eLWWY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=r+0up2cOtJkAM765N9rUCfvgzy5ayX8TzZb6AjY/PTW3jxFmab9JmKs5rdu7E8HuNIAGfBiU8eMxMe953a5TW7BFO6BPunFLiXKnrQ/Ez0KGkCDk7ndEuZmYj1XmaWhW8WmGoRbtSuMcAUc8DRdn1ND3VdioX2rc28l5z8lE14I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=oJXoomdB; arc=none smtp.client-ip=74.125.82.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="oJXoomdB" Received: by mail-dl1-f51.google.com with SMTP id a92af1059eb24-1334825de43so7175037c88.0 for ; Sat, 23 May 2026 02:44:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529443; x=1780134243; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qsDcrfyiYGHg2D0fltIuIiuRF5pesiJjHjMKkWGxFPI=; b=oJXoomdB22ujE9Bpybk7qY2P2lYNyD5AQLxuG0Qxznj4slSREYtcqzwpZbLwituBWW J0G+7PIAFasbRKYwty+gCYjB6FD3PpnlQGmAq1OMrMplOmbpLwyBjZG5NhvYOuAUDrhW DATQVZ2u5oN2ql3b0vt85mfk0NF/sFflvAcMnyYu4cBMz595gShUU+7dmiO5whzXcMX2 muz1TbFGo1sve0/H5Pc8GfIrJKmY7JP2/LlCuPUoxalOnclFpil1yk3SL8nyz+BC8SVM sSWRIl/1Y/vQoXQeJm1ckrUGg/qJyQe6R9XPDtM3TTh92Q6uUpFAaHN/GK5frRBq1ALw nzmw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529443; x=1780134243; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=qsDcrfyiYGHg2D0fltIuIiuRF5pesiJjHjMKkWGxFPI=; b=YPToqC5jD0Q9zyW/Iigz9N3kXgqUrd3nnj7chuC6a/VkCmbqgZ0pHrerOlqbuNJ5Zy +Si6lE08Q8Rpup8EMzs02wlpjTDEnmT7q1I3nf7Y+LJYD2YBOtuspp8s9CCzvXAyOuSd YjshMUZyXRD/ZRgjTfr52QJLeG2y+wBhK8h4y7m6X8Q/GCteOlj7N9qg6G4jjY6zvumW rKLIixQlepDrlZR9JSn4KWlAuMeyXcm/9pzaGSA7l1kypdQWC6hfQB/1KJzs3kkBRNSw ZE2kdwdtiAiIdP+bZjWSXxcz72n6Io0GmyXP1oz7sETroVm88+4FfIiB7DNld3wu1EMD WixQ== X-Forwarded-Encrypted: i=1; AFNElJ9FcOqk2yM6omMF0G1wif/C6VwTeQeTEhzBQzWtbVhnPdwtexLFbzpsiipmphKTDZJWgjvdKpKoftHgsJw=@vger.kernel.org X-Gm-Message-State: AOJu0Yy3JFjwWwddnMv5WH+Deq9TSyJRFEfGvJJGcdtbll/o738IeOf3 GQkZU5HTfXX71zfy7Z7nhi/Z0qmUr7OrQ+JonkBKq+JXTj/JwGYkDLt9 X-Gm-Gg: Acq92OFBQ+uDqr6e7XucXhuULvRi+NzDihb51H7cSFikGT0vt6sg/yRibRXkwotGN8e JV7x3shRUYNkwcah3XURmTbxaHrAIo3R/jVSuT7Zkzhl11qZkjiVfYFahtUCA2doDnqm24Le1fK l7TokXVp2wLmkLhQH4XwJ5e8tYSk+Jzm96eER/Xl52uVeWb/lLSKvmnflCBGPnvPON6yW0PJwfb sFgF8QqSutWlgu2cuTdJ0Siv+L63tHLlMSBOWUbQ/yaGTFp9zx/bS8gGlw/3VQAVNyPXFWCqS5X cAlEK9MqZgwtpqPg7GHYttr1Aa7dTD9SxgbbPgBCcQ4EU5UNGVwcV+svqHupJL5LFtlnm4IGgdB rK1XCBIcKleat1yHubVwYeGpuZGxO/B+EhiPnQw9JIPdXGgMsZt0IUKxroJ/AYTodumWMNI8Rj5 VYBPI/KasK/j/GQaBLo6k2mFpVJeJhXAgXehhdXS5JRzQchw1jOLX3y90K5cNQY2LiXmhP+oKwx g1UYq1PdNCL3ADHFQ== X-Received: by 2002:a05:7022:218:b0:12a:6a64:81ee with SMTP id a92af1059eb24-1365f5f2f3amr2607953c88.3.1779529442484; Sat, 23 May 2026 02:44:02 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:02 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su Subject: [PATCH v10 13/31] cxl/mem: Add 20 second timeout for stalled DC_ADD_CAPACITY chains Date: Sat, 23 May 2026 02:43:07 -0700 Message-ID: <68caa60e758cb8ad5c9d0870cace911829ac965d.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable

A DC_ADD_CAPACITY event can span multiple event records grouped together
by the CXL_DCD_EVENT_MORE flag. Extents are staged in the pending list
until the last event record ('More'=0) is received, at which point the
pending list is processed.

If the device opens such a chain (More=1) but never sends the closing
record, the staged list sits indefinitely. Add a delayed-work watchdog
that, on expiry, refuses the chain with an empty ADD_DC_RESPONSE and drops
the staged list.

The 20s timeout is a conservative upper bound and may be tightened later.
The timeout is purely defensive: the spec does not require it, but it
guards against issues from a lost mailbox response or a crashed fabric
manager.

Signed-off-by: Anisa Su --- drivers/cxl/core/mbox.c | 73 ++++++++++++++++++++++++++++++++++++++++- drivers/cxl/cxlmem.h | 23 ++++++++++--- 2 files changed, 91 insertions(+), 5 deletions(-) diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 1b38f34538f3..c376492fa166 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -1219,6 +1219,48 @@ static void clear_pending_extents(void *_mds) mds->add_ctx.group =3D NULL; } =20 +/* + * Bound on how long the host will wait for a device to finish a + * multi-record DC_ADD_CAPACITY chain (More=3D1 ... More=3D0) before + * refusing the chain. + * The timeout is not defined in the spec, but added for defensive purpose= s. + * Since there is no spec-defined timeout, 20s is chosen as a generous + * upper bound and matches the GPF timeout. + */ +#define CXL_DC_ADD_TIMEOUT (20 * HZ) + +static void cxl_dc_add_timeout(struct work_struct *work) +{ + struct pending_add_ctx *ctx =3D container_of(to_delayed_work(work), + struct pending_add_ctx, + timeout_work); + struct cxl_memdev_state *mds =3D container_of(ctx, + struct cxl_memdev_state, + add_ctx); + struct device *dev =3D mds->cxlds.dev; + + guard(mutex)(&ctx->lock); + + if (!ctx->armed) + return; + + dev_warn(dev, "DC add chain timed out; refusing staged extents\n"); + + if (cxl_send_dc_response(mds, CXL_MBOX_OP_ADD_DC_RESPONSE, + &ctx->pending_extents, 0)) + dev_dbg(dev, "Failed to send empty ADD_DC_RESPONSE on timeout\n"); + + clear_pending_extents(mds); + ctx->armed =3D false; +} + +static void cxl_cancel_dcd_add_chain_work(void *_mds) +{ + struct cxl_memdev_state *mds =3D _mds; + + cancel_delayed_work_sync(&mds->add_ctx.timeout_work); +} + static int add_to_pending_list(struct list_head *pending_list, struct cxl_extent *to_add) { @@ -1246,18 +1288,34 @@ static int add_to_pending_list(struct list_head *pe= nding_list, static int handle_add_event(struct cxl_memdev_state *mds, struct cxl_event_dcd *event) { + struct pending_add_ctx *ctx =3D 
&mds->add_ctx; struct device *dev =3D mds->cxlds.dev; int rc; =20 - rc =3D add_to_pending_list(&mds->add_ctx.pending_extents, &event->extent); + guard(mutex)(&ctx->lock); + + rc =3D add_to_pending_list(&ctx->pending_extents, &event->extent); if (rc) return rc; =20 if (event->flags & CXL_DCD_EVENT_MORE) { dev_dbg(dev, "more bit set; delay the surfacing of extent\n"); + mod_delayed_work(system_wq, &ctx->timeout_work, + CXL_DC_ADD_TIMEOUT); + ctx->armed =3D true; return 0; } =20 + /* + * Chain is closing. Disarm before flushing so a pending watchdog + * (queued but blocked on @ctx->lock) sees !armed and bails out. + * cancel_delayed_work() =E2=80=94 not _sync =E2=80=94 because handle_add= _event() + * itself runs on system_wq and a sync cancel of same-wq work can + * deadlock. + */ + ctx->armed =3D false; + cancel_delayed_work(&ctx->timeout_work); + rc =3D cxl_send_dc_response(mds, CXL_MBOX_OP_ADD_DC_RESPONSE, &mds->add_ctx.pending_extents, 0); clear_pending_extents(mds); @@ -2009,11 +2067,24 @@ struct cxl_memdev_state *cxl_memdev_state_create(st= ruct device *dev, u64 serial, =20 mutex_init(&mds->event.log_lock); INIT_LIST_HEAD(&mds->add_ctx.pending_extents); + mutex_init(&mds->add_ctx.lock); + INIT_DELAYED_WORK(&mds->add_ctx.timeout_work, + cxl_dc_add_timeout); + mds->add_ctx.armed =3D false; =20 rc =3D devm_add_action_or_reset(dev, clear_pending_extents, mds); if (rc) return ERR_PTR(rc); =20 + /* + * Registered after clear_pending_extents so devm's reverse-order + * unwind cancels (and waits for) the watchdog first, then the list + * cleanup runs with the watchdog guaranteed not to refire. 
+ */ + rc =3D devm_add_action_or_reset(dev, cxl_cancel_dcd_add_chain_work, mds); + if (rc) + return ERR_PTR(rc); + rc =3D devm_cxl_register_mce_notifier(dev, &mds->mce_notifier); if (rc =3D=3D -EOPNOTSUPP) dev_warn(dev, "CXL MCE unsupported\n"); diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index 592c8e3b611c..d992cc9b7811 100644 --- a/drivers/cxl/cxlmem.h +++ b/drivers/cxl/cxlmem.h @@ -8,6 +8,8 @@ #include #include #include +#include +#include #include #include #include "cxl.h" @@ -402,19 +404,32 @@ static inline struct cxl_dev_state *mbox_to_cxlds(str= uct cxl_mailbox *cxl_mbox) =20 /** * struct pending_add_ctx - Staging state for an in-progress - * DCD_ADD_CAPACITY event chain + * DCD_ADD_CAPACITY event chain * @pending_extents: extents received so far in the chain; flushed when - * the chain closes (More=3D0) + * the chain closes (More=3D0) * @group: tag group being assembled from the chain + * @timeout_work: watchdog that fires if a chain is opened with + * CXL_DCD_EVENT_MORE but the closing record never arrives + * @lock: serialises updates to the chain state against the watchdog + * @armed: set when a More=3D1 chain opens; cleared when the chain closes, + * either by a More=3D0 event record or by the watchdog firing. * * A DCD_ADD_CAPACITY notification can span multiple event records * stitched together by the CXL_DCD_EVENT_MORE flag. Records are staged - * here until the device clears More, at which point the staged batch is - * processed and responded to as a single Add_DC_Response. + * here until an event record with 'More'=3D0 is received, at which point = the + * staged batch is processed and responded to as a single Add_DC_Response. + * + * If a chain is opened (More=3D1) but the device never sends the closing + * record, the staged list would otherwise sit indefinitely. @timeout_work + * is a defensive watchdog that refuses such a chain with an empty response + * and drops the staged list. 
*/ struct pending_add_ctx { struct list_head pending_extents; struct cxl_dc_tag_group *group; + struct delayed_work timeout_work; + struct mutex lock; + bool armed; }; =20 /** --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f47.google.com (mail-dl1-f47.google.com [74.125.82.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F3AF2386C31 for ; Sat, 23 May 2026 09:44:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529448; cv=none; b=Pay0VngRda4SowGIAPvDh3wenagnkyJop2ldUx0B8S0r6u/wGeg84JcwiwDSO0o31p1op5oTNNdEJxo9NbD36W9yt+E0Vsh+9u/YP+xn2+DuFxYdgqUWbGjzsGY/IfQnDi2lsEuLv3/GSncNoElWQBAWx3mh672mRckPbA5TQnM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529448; c=relaxed/simple; bh=R49tJC73z5W7Jf8ONCPvTNZLKPmMr6Pdy7hHV8pugfc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=NWf2t4s34PcLEHNnMbYvdNJrAnUraHTQj+4fTVIoNO0EqP/Zl9nJ49N7on6KD74hADI2vTTwi1R6MiZvAx3t3t0a5MyZL4EwbJq/xLvS7zoqBewWEwclkSp3NczSEIxt8nnpjCU3DL1kMH1Qi0oshYRKBgxRLRwaysiJ/tTUFA4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=K1MO/vd6; arc=none smtp.client-ip=74.125.82.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="K1MO/vd6" Received: by mail-dl1-f47.google.com with SMTP id a92af1059eb24-13663f68983so1944244c88.0 for ; Sat, 23 May 2026 
02:44:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529444; x=1780134244; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=SnNjExx+uwok7hjdnj3d6W7x3+m9v/YyStc0Sn+dkuk=; b=K1MO/vd60eptlHLZD8UymvTCtJ83k7sYm+a84HpE+E7QprLz0K6kJx8gq3/sHfl9UH HK3mHvFmXlxWh9XQbZvtajBDiaeXMB16GyK6tDJDXOGF289Dp70Pooi5zq8roqDZu/n+ issKwWgrOV3rfon0dIk8xTvuWFCi2BJlBG0b0DqOTAv0FeNGItHbaFVKD9H5FBn6r+3u H3AWrbLgOtt+HecHFNpgkAXJ9BiOkoPepNOLv+nDPkbMXukV1wh2/zihbI/E59tgrW7p Td3b758eDPBjkQzuv+lCr21IEpWFs/YanAbTYjJPrEify7j4cf6Igbkyh9dTmPGyyP3B aByg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529444; x=1780134244; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=SnNjExx+uwok7hjdnj3d6W7x3+m9v/YyStc0Sn+dkuk=; b=B/jeuDOlxiKygVLvk8QudYjMHjh1ui7ARypF8/arPi+iwCeeMRpKMEKfsCkaAi56mi 9oYfK9JPPjYOhUw3cG1RF/RsjVikj/epF+lGOf30REdDLmLqAih0yMpeXFwKzSBCKFEh ENlIQjGn23XpfpKzxLmx9YVrNWmGcE0U69UMb8QIWfGqMrSZsElOYRGFQodKeTWsclWr eqUyuy+BD1BJnxqBYzhz9IM/VrWCvjIVbkZzaseRSZO3S+NV7KXGNu7B+JlTuJaw7Un6 90JMgytpzOcaxDcildnb6WHx4Jl3dEr1vwQ+bqNii3DxRKwAMTjgtQW/CuVn00hd1wBr uPQA== X-Forwarded-Encrypted: i=1; AFNElJ9kbMnhdZa2OUxfqTGZQZu0U5a8vyhkWJStBQo05C82+ujregEubZjnTWOXgGZT/a+tOhiihcJ46OjAKoU=@vger.kernel.org X-Gm-Message-State: AOJu0Yw2KnBeLxX34VPzPkZhGaJgJCA9Xk3lSMEdrIJZqj/RQCwN/Kzv sfNV2r+XJETFHKokHt4jAGyff5lQ+PouQZDmg8YaXLkecShYW7dOyzip X-Gm-Gg: Acq92OFAkMmIqU9YKOiReQ8q7FL9g2lPQZ1C1Xl4/H4vAfUm7zepLO6MkM7fBbQo5HF Cpf+euOBpJP2qAxmgslPNISUiBnY0abtiZuCgpvHVFnmARvXPXHbwH0jAjc3zDrsXulMq/+XV3o u+/2k9kW2cnaSR4EeubcMEAhqeYqd+5pVrrPBbVMc8W1+w0FKVH9zxHCIcOMmYF+QEXDppVMRe6 o7J3oGrS0NIsRkK0oF3y0iyntw+oAQf/XwQLp3bu8NwnTQh1Rda3lq2MxPRmI/a81Mizj1GJbhr 
FtUkKke+YNrtS0BvvkjHcyrvyyQQIGQczTjZc+SSjr4nHAK0ZKYP2QpHu9MGHnSZj016AXK+41Z psBVSKgc9a7sPTgfrWfZi+Xswzo+onjvyHNHvB/D4ReXqNcWmPLnkiTOGnUtkqu5OZApyx8YhIG f2rsW6tajwQcepXf0Ze48G7uSkIJL9wzcKo6xTmxMD/k/ZxuuRps/hPiUFCKs7N8liFBprUSskI fYJGh7SRzGn62l/KQ== X-Received: by 2002:a05:701b:240f:b0:132:f27:5302 with SMTP id a92af1059eb24-1365f5fe8fcmr1479722c88.3.1779529444026; Sat, 23 May 2026 02:44:04 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:03 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su , Ira Weiny Subject: [PATCH v10 14/31] cxl/extent: Handle DC Add Capacity events Date: Sat, 23 May 2026 02:43:08 -0700 Message-ID: <22f480966589928b457ed34ee291161c8cf5af75.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Replace the empty-response stub in handle_add_event() with the real add pipeline. DC Event Records can be grouped together with the 'More' flag. The previous commit completed the set up for holding onto extents in the pending list until receiving the last event record of the group, marked by 'More'=3D0. This commit fills in the logic for processing the pending list and adds basic validation for extents before they are added to the device model as a child of the cxlr_dax region. 
More complete checks for tags, sequence numbers and alignment are added in
subsequent commits.

For each tag that appears in the pending list:
1. Extract all extents in the pending list with that tag to a local list.
2. The spec requires that sharable extents are ordered by shared extent
   sequence number, which "instructs each host on the relative order
   these extents must be placed in adjacent virtual address space"
   (r4.0 Section 9.13.3 Figure 9-23 Shared Extent List Example).
   Otherwise, retain arrival order. Thus the tag group is stable-sorted
   by shared_extn_seq; for non-sharable extents every key is 0 and the
   stable sort preserves arrival order.

Individual extents are checked for the following:
1. The extent's DPA range fully resolves to an endpoint decoder.
2. The extent doesn't overlap with a previously accepted extent.
3. The sequence number doesn't collide with others in the same tag group.

Upon passing these checks, extents are "onlined" together as a tag group:
online_tag_group() registers a struct device per dc_extent under
cxlr_dax->dev so the dax layer can discover them via
device_for_each_child().

Once the pending list has been fully processed, send the ADD_DC_RESPONSE.

Based on an original patch by Navneet Singh.

Signed-off-by: Ira Weiny
Signed-off-by: Anisa Su
---
Changes:
[anisa: restructured from the original "Process dynamic partition
events" monolith; this commit fills in the Add path on top of the
previous commit's stubs. Further validation lands in subsequent commits.]
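
Illustration only, not part of the patch: a minimal userspace sketch of the
ordering rule described in the commit message. It assumes a hypothetical
struct staged_extent and hypothetical helpers sort_tag_group() and
has_seq_collision(); none of these names appear in the series, and the
kernel code may well use a different mechanism (the diff detects collisions
through xa_insert() returning -EBUSY). The point is only that a stable sort
keyed on shared_extn_seq leaves an all-zero (non-sharable) group in arrival
order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one extent staged in a tag group. */
struct staged_extent {
	uint16_t shared_extn_seq;	/* 0 for non-sharable extents */
	unsigned int arrival;		/* position in the pending list */
};

/*
 * Stable insertion sort keyed on shared_extn_seq. The strict '>'
 * comparison never moves an element past one with an equal key, so
 * equal keys (e.g. all zero) keep their arrival order.
 */
static void sort_tag_group(struct staged_extent *ext, size_t n)
{
	assert(ext || n == 0);

	for (size_t i = 1; i < n; i++) {
		struct staged_extent cur = ext[i];
		size_t j = i;

		while (j > 0 && ext[j - 1].shared_extn_seq > cur.shared_extn_seq) {
			ext[j] = ext[j - 1];
			j--;
		}
		ext[j] = cur;
	}
}

/*
 * On an already sorted group, report whether two extents carry the
 * same non-zero sequence number. Returns 1 on collision, 0 otherwise.
 */
static int has_seq_collision(const struct staged_extent *ext, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (ext[i].shared_extn_seq &&
		    ext[i].shared_extn_seq == ext[i - 1].shared_extn_seq)
			return 1;
	return 0;
}
```

In this sketch a group that arrives with sequence numbers 3, 1, 2 is
reordered to 1, 2, 3, while a group whose keys are all 0 is left untouched.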
--- drivers/cxl/core/Makefile | 2 +- drivers/cxl/core/core.h | 13 ++ drivers/cxl/core/extent.c | 372 ++++++++++++++++++++++++++++++++++ drivers/cxl/core/mbox.c | 123 ++++++++++- drivers/cxl/core/region_dax.c | 3 + drivers/cxl/cxl.h | 19 ++ tools/testing/cxl/Kbuild | 5 +- 7 files changed, 528 insertions(+), 9 deletions(-) create mode 100644 drivers/cxl/core/extent.c diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile index ce7213818d3c..208917ad8aac 100644 --- a/drivers/cxl/core/Makefile +++ b/drivers/cxl/core/Makefile @@ -15,7 +15,7 @@ cxl_core-y +=3D hdm.o cxl_core-y +=3D pmu.o cxl_core-y +=3D cdat.o cxl_core-$(CONFIG_TRACING) +=3D trace.o -cxl_core-$(CONFIG_CXL_REGION) +=3D region.o region_pmem.o region_dax.o +cxl_core-$(CONFIG_CXL_REGION) +=3D region.o region_pmem.o region_dax.o ext= ent.o cxl_core-$(CONFIG_CXL_MCE) +=3D mce.o cxl_core-$(CONFIG_CXL_FEATURES) +=3D features.o cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) +=3D edac.o diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 14723cfd05f0..1bae80dbf991 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -65,12 +65,24 @@ u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struc= t cxl_memdev *cxlmd, int devm_cxl_add_dax_region(struct cxl_region *cxlr); int devm_cxl_add_pmem_region(struct cxl_region *cxlr); =20 +int cxl_add_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent, + u16 seq_num); +int online_tag_group(struct cxl_dc_tag_group *group); #else static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, u64 dpa) { return ULLONG_MAX; } +static inline int cxl_add_extent(struct cxl_memdev_state *mds, + struct cxl_extent *extent, u16 seq_num) +{ + return 0; +} +static inline int online_tag_group(struct cxl_dc_tag_group *group) +{ + return 0; +} static inline struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa, struct cxl_endpoint_decoder **cxled) @@ -166,6 +178,7 @@ long 
cxl_pci_get_latency(struct pci_dev *pdev);
 int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
 int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
 					struct access_coordinate *c);
+void memdev_release_extent(struct cxl_memdev_state *mds, struct range *range);
 
 static inline struct device *port_to_host(struct cxl_port *port)
 {
diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
new file mode 100644
index 000000000000..94128d06f4ed
--- /dev/null
+++ b/drivers/cxl/core/extent.c
@@ -0,0 +1,372 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2024 Intel Corporation. All rights reserved. */
+
+#include
+#include
+
+#include "core.h"
+
+
+static void cxled_release_extent(struct cxl_endpoint_decoder *cxled,
+				 struct dc_extent *dc_extent)
+{
+	struct cxl_memdev_state *mds = cxled_to_mds(cxled);
+	struct device *dev = &cxled->cxld.dev;
+
+	dev_dbg(dev, "Remove extent %pra (%pU)\n",
+		&dc_extent->dpa_range, &dc_extent->uuid);
+	memdev_release_extent(mds, &dc_extent->dpa_range);
+}
+
+static void free_tag_group(struct cxl_dc_tag_group *group)
+{
+	xa_destroy(&group->dc_extents);
+	kfree(group);
+}
+
+static void dc_extent_release(struct device *dev)
+{
+	struct dc_extent *dc_extent = to_dc_extent(dev);
+	struct cxl_dc_tag_group *group = dc_extent->group;
+
+	cxled_release_extent(dc_extent->cxled, dc_extent);
+	xa_erase(&group->cxlr_dax->dc_extents, dc_extent->dev.id);
+	xa_erase(&group->dc_extents, dc_extent->seq_num);
+	group->nr_extents--;
+	if (!group->nr_extents)
+		free_tag_group(group);
+	kfree(dc_extent);
+}
+
+static const struct device_type dc_extent_type = {
+	.name = "extent",
+	.release = dc_extent_release,
+};
+
+bool is_dc_extent(struct device *dev)
+{
+	return dev->type == &dc_extent_type;
+}
+EXPORT_SYMBOL_NS_GPL(is_dc_extent, "CXL");
+
+static struct cxl_dc_tag_group *
+alloc_tag_group(struct cxl_dax_region *cxlr_dax, uuid_t *uuid)
+{
+	struct cxl_dc_tag_group *group __free(kfree) =
+		kzalloc(sizeof(*group), GFP_KERNEL);
+	if (!group)
+		return ERR_PTR(-ENOMEM);
+
+	group->cxlr_dax = cxlr_dax;
+	uuid_copy(&group->uuid, uuid);
+	xa_init(&group->dc_extents);
+	return no_free_ptr(group);
+}
+
+/*
+ * Stage 1 of the add pipeline: pure, no allocation. Resolve the extent
+ * to its region/endpoint decoder and ext_range, and verify the range
+ * fits in the resolved endpoint decoder's DPA resource. Further
+ * per-extent invariants layer into this function in subsequent commits.
+ *
+ * Caller must hold cxl_rwsem.region for read (cxl_dpa_to_region()).
+ * On success, @out_cxled / @out_cxlr_dax / @out_ext_range carry the
+ * resolved handles consumed by the rest of the pipeline.
+ */
+static int cxl_validate_extent(struct cxl_memdev_state *mds,
+			       struct cxl_extent *extent,
+			       struct cxl_endpoint_decoder **out_cxled,
+			       struct cxl_dax_region **out_cxlr_dax,
+			       struct range *out_ext_range)
+{
+	u64 start_dpa = le64_to_cpu(extent->start_dpa);
+	struct cxl_memdev *cxlmd = mds->cxlds.cxlmd;
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_region *cxlr;
+	struct range ext_range = (struct range) {
+		.start = start_dpa,
+		.end = start_dpa + le64_to_cpu(extent->length) - 1,
+	};
+	struct range ed_range;
+
+	cxlr = cxl_dpa_to_region(cxlmd, start_dpa, &cxled);
+	if (!cxlr)
+		return -ENXIO;
+
+	ed_range = (struct range) {
+		.start = cxled->dpa_res->start,
+		.end = cxled->dpa_res->end,
+	};
+	if (!range_contains(&ed_range, &ext_range)) {
+		dev_err_ratelimited(&cxled->cxld.dev,
+				    "DC extent DPA %pra (%pU) is not fully in ED %pra\n",
+				    &ext_range, extent->uuid, &ed_range);
+		return -ENXIO;
+	}
+
+	*out_cxled = cxled;
+	*out_cxlr_dax = cxlr->cxlr_dax;
+	*out_ext_range = ext_range;
+	return 0;
+}
+
+enum cxl_extent_class {
+	CXL_EXT_NEW,
+	CXL_EXT_DUPLICATE,
+	CXL_EXT_OVERLAP,
+};
+
+/*
+ * Stage 2: classify @ext_range against extents already accepted on this
+ * cxlr_dax+cxled. Walks cxlr_dax->dc_extents once: a stored extent that
+ * fully contains @ext_range means a duplicate accept (idempotent, fine);
+ * a stored extent that only overlaps means an inconsistent offer.
+ */
+static enum cxl_extent_class
+cxlr_dax_classify_extent(struct cxl_dax_region *cxlr_dax,
+			 struct cxl_endpoint_decoder *cxled,
+			 const struct range *ext_range)
+{
+	struct dc_extent *entry;
+	unsigned long i;
+
+	xa_for_each(&cxlr_dax->dc_extents, i, entry) {
+		if (entry->cxled != cxled)
+			continue;
+		if (range_contains(&entry->dpa_range, ext_range))
+			return CXL_EXT_DUPLICATE;
+		if (range_overlaps(&entry->dpa_range, ext_range))
+			return CXL_EXT_OVERLAP;
+	}
+	return CXL_EXT_NEW;
+}
+
+/*
+ * Stage 3: allocate and populate a dc_extent for an already-validated,
+ * already-classified-as-new @ext_range. Only -ENOMEM can fail here.
+ */
+static struct dc_extent *
+dc_extent_build(struct cxl_endpoint_decoder *cxled,
+		struct cxl_dax_region *cxlr_dax,
+		struct cxl_extent *extent,
+		const struct range *ext_range, u16 seq_num)
+{
+	resource_size_t dpa_offset = ext_range->start - cxled->dpa_res->start;
+	resource_size_t hpa = cxled->cxld.hpa_range.start + dpa_offset;
+	struct dc_extent *dc_extent;
+
+	dc_extent = kzalloc(sizeof(*dc_extent), GFP_KERNEL);
+	if (!dc_extent)
+		return ERR_PTR(-ENOMEM);
+
+	dc_extent->cxled = cxled;
+	dc_extent->dpa_range = *ext_range;
+	dc_extent->hpa_range.start = hpa - cxlr_dax->hpa_range.start;
+	dc_extent->hpa_range.end = dc_extent->hpa_range.start +
+				   range_len(ext_range) - 1;
+	dc_extent->seq_num = seq_num;
+	import_uuid(&dc_extent->uuid, extent->uuid);
+	return dc_extent;
+}
+
+/*
+ * Stage 4: insert @dc_extent into the pending tag group. All extents in
+ * one More-chain group share a UUID, enforced here as the group is
+ * either being created (first extent) or appended to. On any failure
+ * the dc_extent is freed.
+ */
+static int cxlr_add_extent(struct cxl_memdev_state *mds,
+			   struct cxl_dax_region *cxlr_dax,
+			   struct dc_extent *dc_extent)
+{
+	struct cxl_dc_tag_group **group = &mds->add_ctx.group;
+	int rc;
+
+	if (*group && !uuid_equal(&(*group)->uuid, &dc_extent->uuid)) {
+		kfree(dc_extent);
+		return -EINVAL;
+	}
+
+	if (!*group) {
+		dev_dbg(&cxlr_dax->dev, "Alloc new tag group\n");
+		*group = alloc_tag_group(cxlr_dax, &dc_extent->uuid);
+		if (IS_ERR(*group)) {
+			rc = PTR_ERR(*group);
+			*group = NULL;
+			kfree(dc_extent);
+			return rc;
+		}
+	} else {
+		dev_dbg(&cxlr_dax->dev, "Append dc_extent to tag group\n");
+	}
+
+	dc_extent->group = *group;
+
+	/*
+	 * Key by @seq_num so iteration order equals assembly order, in both
+	 * the sharable case (device-stamped 1..n) and the non-sharable case
+	 * (host-assigned arrival-order 1..n). A collision here signals a
+	 * cxl-side validation gap.
+	 */
+	rc = xa_insert(&(*group)->dc_extents, dc_extent->seq_num,
+		       dc_extent, GFP_KERNEL);
+	if (rc) {
+		dev_WARN_ONCE(&cxlr_dax->dev, rc == -EBUSY,
+			      "duplicate seq_num %u in tag %pUb\n",
+			      dc_extent->seq_num, &dc_extent->uuid);
+		kfree(dc_extent);
+		return rc;
+	}
+
+	return 0;
+}
+
+int cxl_add_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent,
+		   u16 seq_num)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_dax_region *cxlr_dax;
+	struct dc_extent *dc_extent;
+	struct range ext_range;
+	int rc;
+
+	guard(rwsem_read)(&cxl_rwsem.region);
+
+	rc = cxl_validate_extent(mds, extent, &cxled, &cxlr_dax, &ext_range);
+	if (rc)
+		return rc;
+
+	switch (cxlr_dax_classify_extent(cxlr_dax, cxled, &ext_range)) {
+	case CXL_EXT_DUPLICATE:
+		/*
+		 * Idempotent accept simplifies the dax-side scan for existing
+		 * extents on region creation; reply success without duplicating.
+		 */
+		dev_warn_ratelimited(&cxled->cxld.dev,
+				     "Extent %pra exists; accept again\n",
+				     &ext_range);
+		return 0;
+	case CXL_EXT_OVERLAP:
+		return -ENXIO;
+	case CXL_EXT_NEW:
+		break;
+	}
+
+	dc_extent = dc_extent_build(cxled, cxlr_dax, extent, &ext_range,
+				    seq_num);
+	if (IS_ERR(dc_extent))
+		return PTR_ERR(dc_extent);
+
+	dev_dbg(&cxled->cxld.dev, "Add extent %pra (%pU)\n",
+		&dc_extent->dpa_range, &dc_extent->uuid);
+
+	return cxlr_add_extent(mds, cxlr_dax, dc_extent);
+}
+
+static void dc_extent_unregister(void *ext)
+{
+	struct dc_extent *dc_extent = ext;
+
+	dev_dbg(&dc_extent->dev, "DAX region rm extent HPA %pra\n",
+		&dc_extent->hpa_range);
+	device_unregister(&dc_extent->dev);
+}
+
+static void cleanup_pending_dc_extent(struct dc_extent *dc_extent)
+{
+	struct cxl_dc_tag_group *group = dc_extent->group;
+
+	cxled_release_extent(dc_extent->cxled, dc_extent);
+	xa_erase(&group->dc_extents, dc_extent->seq_num);
+	group->nr_extents--;
+	if (!group->nr_extents)
+		free_tag_group(group);
+	kfree(dc_extent);
+}
+
+int online_tag_group(struct cxl_dc_tag_group *group)
+{
+	struct cxl_dax_region *cxlr_dax = group->cxlr_dax;
+	struct dc_extent *dc_extent;
+	unsigned long index;
+	int rc = 0;
+
+	/*
+	 * Seed nr_extents with the full group size plus a +1 pin held by
+	 * this function. The size counts every dc_extent that might
+	 * decrement nr_extents on cleanup; the pin keeps @group alive
+	 * across the body even if every dc_extent release fires inside
+	 * the loop (e.g. devm_add_action_or_reset failure on the only
+	 * pending extent). The pin is dropped at the end of the function.
+	 */
+	xa_for_each(&group->dc_extents, index, dc_extent)
+		group->nr_extents++;
+	group->nr_extents++;
+
+	xa_for_each(&group->dc_extents, index, dc_extent) {
+		struct device *dev = &dc_extent->dev;
+		u32 id;
+
+		device_initialize(dev);
+		device_set_pm_not_required(dev);
+		dev->parent = &cxlr_dax->dev;
+		dev->type = &dc_extent_type;
+
+		rc = xa_alloc(&cxlr_dax->dc_extents, &id, dc_extent,
+			      xa_limit_32b, GFP_KERNEL);
+		if (rc < 0) {
+			put_device(dev);
+			break;
+		}
+		dev->id = id;
+
+		rc = dev_set_name(dev, "extent%d.%d", cxlr_dax->cxlr->id,
+				  dev->id);
+		if (rc) {
+			xa_erase(&cxlr_dax->dc_extents, dev->id);
+			put_device(dev);
+			break;
+		}
+
+		rc = device_add(dev);
+		if (rc) {
+			xa_erase(&cxlr_dax->dc_extents, dev->id);
+			put_device(dev);
+			break;
+		}
+
+		dev_dbg(dev, "dc_extent HPA %pra (%pU)\n",
+			&dc_extent->hpa_range, &group->uuid);
+
+		rc = devm_add_action_or_reset(&cxlr_dax->dev,
+					      dc_extent_unregister, dc_extent);
+		if (rc)
+			break;
+	}
+
+	if (rc) {
+		/*
+		 * Unwind every remaining dc_extent in the group. The pin
+		 * above keeps @group alive across this walk. Distinguish
+		 * onlined dc_extents (have a devm action) from pending ones
+		 * via devm_remove_action_nowarn(): a 0 return means the
+		 * action was installed and is now consumed, so we run the
+		 * unregister ourselves; -ENOENT means pending.
+		 */
+		xa_for_each(&group->dc_extents, index, dc_extent) {
+			int r = devm_remove_action_nowarn(&cxlr_dax->dev,
+							  dc_extent_unregister,
+							  dc_extent);
+			if (r == 0)
+				dc_extent_unregister(dc_extent);
+			else
+				cleanup_pending_dc_extent(dc_extent);
+		}
+	}
+
+	/* Drop the pin; if nothing else still references @group, free it. */
+	group->nr_extents--;
+	if (!group->nr_extents)
+		free_tag_group(group);
+	return rc;
+}
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index c376492fa166..e5edc3975e8f 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1181,7 +1182,7 @@ static void delete_extent_node(struct cxl_extent_list_node *node)
 	kfree(node);
 }
 
-static void memdev_release_extent(struct cxl_memdev_state *mds, struct range *range)
+void memdev_release_extent(struct cxl_memdev_state *mds, struct range *range)
 {
 	struct device *dev = mds->cxlds.dev;
 	struct cxl_extent_list_node *node;
@@ -1280,11 +1281,120 @@ static int add_to_pending_list(struct list_head *pending_list,
 }
 
 /*
- * Stub: stage extents on the pending list and reply with an empty
- * ADD_DC_RESPONSE on More=0 (refuse all). A later commit replaces
- * the no-op tail with the real Add pipeline that surfaces a dax
- * device per accepted extent.
+ * Compare two extents by shared_extn_seq (ascending). list_sort is
+ * stable so when shared_extn_seq is 0 for every entry (non-sharable
+ * partition) ties fall back to arrival order via list_add_tail() in
+ * add_to_pending_list().
  */
+static int extent_seq_compare(void *priv,
+			      const struct list_head *a,
+			      const struct list_head *b)
+{
+	const struct cxl_extent_list_node *ea =
+		list_entry(a, struct cxl_extent_list_node, list);
+	const struct cxl_extent_list_node *eb =
+		list_entry(b, struct cxl_extent_list_node, list);
+	u16 sa = le16_to_cpu(ea->extent->shared_extn_seq);
+	u16 sb = le16_to_cpu(eb->extent->shared_extn_seq);
+
+	if (sa < sb)
+		return -1;
+	if (sa > sb)
+		return 1;
+	return 0;
+}
+
+/*
+ * Move every pending extent whose tag matches @tag onto @group, preserving
+ * the order they appear in @pending.
+ */
+static void extract_tag_group(struct list_head *pending,
+			      const uuid_t *tag,
+			      struct list_head *group)
+{
+	struct cxl_extent_list_node *pos, *tmp;
+
+	list_for_each_entry_safe(pos, tmp, pending, list) {
+		uuid_t t;
+
+		import_uuid(&t, pos->extent->uuid);
+		if (uuid_equal(&t, tag))
+			list_move_tail(&pos->list, group);
+	}
+}
+
+/*
+ * Drive the pending Add-Capacity records through cxl_add_extent(),
+ * grouped by tag. Per group: extract from pending, stable-sort by
+ * shared_extn_seq, then attempt to add each extent. Online the tag
+ * group via online_tag_group() once all of its extents have been
+ * realized. Validation gates layer onto this loop in later commits.
+ */
+static int cxl_add_pending(struct cxl_memdev_state *mds)
+{
+	struct device *dev = mds->cxlds.dev;
+	struct list_head *pending = &mds->add_ctx.pending_extents;
+	struct cxl_extent_list_node *pos, *tmp;
+	LIST_HEAD(accepted);
+	int total_accepted = 0;
+
+	while (!list_empty(pending)) {
+		LIST_HEAD(group);
+		struct cxl_dc_tag_group *tag_group;
+		int group_cnt = 0;
+		uuid_t tag;
+		int rc;
+
+		import_uuid(&tag,
+			    list_first_entry(pending,
+					     struct cxl_extent_list_node,
+					     list)->extent->uuid);
+		extract_tag_group(pending, &tag, &group);
+		list_sort(NULL, &group, extent_seq_compare);
+
+		u16 logical_seq = 1;
+		list_for_each_entry_safe(pos, tmp, &group, list) {
+			u16 raw = le16_to_cpu(pos->extent->shared_extn_seq);
+			u16 seq = raw ? raw : logical_seq;
+
+			logical_seq++;
+
+			if (cxl_add_extent(mds, pos->extent, seq)) {
+				dev_dbg(dev,
+					"Tag %pUb: failed to add extent DPA:%#llx LEN:%#llx\n",
+					&tag,
+					le64_to_cpu(pos->extent->start_dpa),
+					le64_to_cpu(pos->extent->length));
+				delete_extent_node(pos);
+				continue;
+			}
+			group_cnt++;
+		}
+
+		tag_group = mds->add_ctx.group;
+		if (!tag_group)
+			continue;
+
+		rc = online_tag_group(tag_group);
+		if (rc) {
+			dev_warn(dev,
+				 "Tag %pUb: failed to online tag group (%d)\n",
+				 &tag, rc);
+			list_for_each_entry_safe(pos, tmp, &group, list)
+				delete_extent_node(pos);
+		} else {
+			list_splice_tail_init(&group, &accepted);
+			total_accepted += group_cnt;
+		}
+
+		mds->add_ctx.group = NULL;
+	}
+
+	list_splice(&accepted, pending);
+	return cxl_send_dc_response(mds, CXL_MBOX_OP_ADD_DC_RESPONSE,
+				    pending, total_accepted);
+}
+
 static int handle_add_event(struct cxl_memdev_state *mds,
 			    struct cxl_event_dcd *event)
 {
@@ -1316,8 +1426,7 @@ static int handle_add_event(struct cxl_memdev_state *mds,
 	ctx->armed = false;
 	cancel_delayed_work(&ctx->timeout_work);
 
-	rc = cxl_send_dc_response(mds, CXL_MBOX_OP_ADD_DC_RESPONSE,
-				  &mds->add_ctx.pending_extents, 0);
+	rc = cxl_add_pending(mds);
 	clear_pending_extents(mds);
 	return rc;
 }
diff --git a/drivers/cxl/core/region_dax.c b/drivers/cxl/core/region_dax.c
index d6bf69155827..519e203c486a 100644
--- a/drivers/cxl/core/region_dax.c
+++ b/drivers/cxl/core/region_dax.c
@@ -13,6 +13,7 @@ static void cxl_dax_region_release(struct device *dev)
 {
 	struct cxl_dax_region *cxlr_dax = to_cxl_dax_region(dev);
 
+	xa_destroy(&cxlr_dax->dc_extents);
 	kfree(cxlr_dax);
 }
 
@@ -57,11 +58,13 @@ static struct cxl_dax_region *cxl_dax_region_alloc(struct cxl_region *cxlr)
 	if (!cxlr_dax)
 		return ERR_PTR(-ENOMEM);
 
+	xa_init_flags(&cxlr_dax->dc_extents, XA_FLAGS_ALLOC);
 	cxlr_dax->hpa_range.start = p->res->start;
 	cxlr_dax->hpa_range.end = p->res->end;
 
 	dev = &cxlr_dax->dev;
 	cxlr_dax->cxlr = cxlr;
+	cxlr->cxlr_dax = cxlr_dax;
 	device_initialize(dev);
 	lockdep_set_class(&dev->mutex, &cxl_dax_region_key);
 	device_set_pm_not_required(dev);
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5ef2cf4d005b..cbbfba92fea9 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -495,6 +495,7 @@ struct cxl_region_params {
  * @type: Endpoint decoder target type
  * @cxl_nvb: nvdimm bridge for coordinating @cxlr_pmem setup / shutdown
  * @cxlr_pmem: (for pmem regions) cached copy of the nvdimm bridge
+ * @cxlr_dax: (for DC regions) cached copy of CXL DAX bridge
  * @flags: Region state flags
  * @params: active + config params for the region
  * @coord: QoS access coordinates for the region
@@ -510,6 +511,7 @@ struct cxl_region {
 	enum cxl_decoder_type type;
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct cxl_pmem_region *cxlr_pmem;
+	struct cxl_dax_region *cxlr_dax;
 	unsigned long flags;
 	struct cxl_region_params params;
 	struct access_coordinate coord[ACCESS_COORDINATE_MAX];
@@ -568,6 +570,15 @@ struct cxl_dax_region {
 	struct device dev;
 	struct cxl_region *cxlr;
 	struct range hpa_range;
+	/*
+	 * dc_extents is keyed by an allocator-assigned u32 (see
+	 * online_tag_group()). Tag groups have no first-class identity in
+	 * this xarray; siblings within a tag find each other via
+	 * dc_extent->group. Tag-uniqueness lookup is a linear xa_for_each
+	 * walk, adequate at the bounded per-region extent counts the
+	 * driver handles.
+	 */
+	struct xarray dc_extents;
 };
 
 /**
@@ -595,6 +606,14 @@ struct cxl_dc_tag_group {
 	unsigned int nr_extents;
 };
 
+bool is_dc_extent(struct device *dev);
+static inline struct dc_extent *to_dc_extent(struct device *dev)
+{
+	if (!is_dc_extent(dev))
+		return NULL;
+	return container_of(dev, struct dc_extent, dev);
+}
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild
index 2be1df80fcc9..8941cf187462 100644
--- a/tools/testing/cxl/Kbuild
+++ b/tools/testing/cxl/Kbuild
@@ -63,7 +63,10 @@ cxl_core-y += $(CXL_CORE_SRC)/hdm.o
 cxl_core-y += $(CXL_CORE_SRC)/pmu.o
 cxl_core-y += $(CXL_CORE_SRC)/cdat.o
 cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o
-cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o $(CXL_CORE_SRC)/region_pmem.o $(CXL_CORE_SRC)/region_dax.o
+cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o \
+				 $(CXL_CORE_SRC)/region_pmem.o \
+				 $(CXL_CORE_SRC)/region_dax.o \
+				 $(CXL_CORE_SRC)/extent.o
 cxl_core-$(CONFIG_CXL_MCE) += $(CXL_CORE_SRC)/mce.o
 cxl_core-$(CONFIG_CXL_FEATURES) += $(CXL_CORE_SRC)/features.o
 cxl_core-$(CONFIG_CXL_EDAC_MEM_FEATURES) += $(CXL_CORE_SRC)/edac.o
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
Received: from mail-dl1-f45.google.com (mail-dl1-f45.google.com [74.125.82.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1895F3876C5 for ; Sat, 23 May 2026 09:44:07 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.45
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529448; cv=none; b=kJYMwalzasNmY6DXvyfhSDINhRFeUAUTkK1cPw9lrIraXfAzGcv6XfpQJJ+HvWX2A741RvVyVcQNWc1lLmu2GmUkyVCEIRDmjdmvvZ8C6BH1lye9hfnorg+3jJsJ7tPC50DE3i0X58OUfjKU+MJD8bLTsov8lXiZbHNPllIJdKo=
ARC-Message-Signature:
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529448; c=relaxed/simple; bh=WjyLoGSIFNUpwvHMOZu5vAsTWSXjF5U+cHQ9JDs0GLA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=soqUzAu443jZJVjnOQimmsJDMvqEppcRsNCMbCPDf++XSE9zr0CMRUER7F5VvU0g4a7Nd7KncKU7uipLkh7fThlqz6iDFuPiEZF7D9yrHEhe6o4rBDDIszbAT0tuNRw0ZuMiqWca65WxV5xKZDHrYesxtoDjFLEnROp1g80msbk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=V2ODoJ6j; arc=none smtp.client-ip=74.125.82.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="V2ODoJ6j" Received: by mail-dl1-f45.google.com with SMTP id a92af1059eb24-13621cca8f5so5370837c88.0 for ; Sat, 23 May 2026 02:44:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529446; x=1780134246; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Zm5D4zZV0MwCvac4jjfgUUR1LIZ7RrjaKcvPmfxl/f8=; b=V2ODoJ6jdKu1rCFh6vywf4VdKKfsWV8liyj7MLVt2na1qS7LuXyM45UyNNJxtxtfEp i6a26lLoJdg7ezf+mGfNgtWprsq+OE6ODreweYlPnYbJ9VT318tfu5SQlJh8UIu+pH3C ZPEuP77lUS6nI/EGWvB8ztnpsHF8wJpeNDcM4Yr6wLHwBwef2G2VV9f5BDLuMjNZHriH ZmLe1Gg3PtWl3S1tJAyzi+KlBca68eo97c35Khx7JGLqfVkJZTArfgKfPhqa1QIGrqKb uJMYCpSqzFmS2VVHsxh6vvxF0XFV+dVrsHrVOH20nItl9ovyxjaCirdpBvXd+PDRrRgO wPAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529446; x=1780134246; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Zm5D4zZV0MwCvac4jjfgUUR1LIZ7RrjaKcvPmfxl/f8=; b=J5O/ifEbM3nMs5YkaDWyxubLHNOXLLOgCFOcSSDITob4tSMsKNNIZIXZ1OquBrasRy f2SvfmlkeO3iVh+9HaAh5in4X2/iU8IdP1nd4JSX8XAhdYwWSOaFcD0F24y1OjIWl5rc VTPh61FUGJADggu9GvAJjvIS+cNlSkMbTqoRPaDILKAwAafjNU05thnD594+2ddsdien ZH11rgU9eGg0Ljh0bAkP9LB1dmF5JQjUEaK/eInlNaY+vWa8nNAl4AgIH1tnX6Hd82oZ JlTNDWn1n0PahpDjjlsb5mFimdEvonoyFM59V8SdJI8225ekUhtwSG7Scgch0wfl04/l HinA== X-Forwarded-Encrypted: i=1; AFNElJ8oTxNlRdnAvdk9mpznWzIIwazzQQC7X24AEwIBm0fGV6IdKfg5JRYOf+9t6xH4kc5KO3mHjVkhNyteQpI=@vger.kernel.org X-Gm-Message-State: AOJu0Yzx3nfnDovZHmMHa2pxnQ+UmHBqNNCrubYlgZyOl0OPLKDjibe2 cQbrOKgliy+HZ8S3Le6jwY5WQwZOcqzuf7dX01mhq+aUuIjypkBUYSHS X-Gm-Gg: Acq92OEOrTU65+dj7044sEWXQpYXK8LZQy2JYRTzPFdE3Y64YLMexkeYOj6T/HCdwW7 DplN/uoBV3Y2J+nqm+SnRoo+21wGISd1+3oNppvQqvMtZOOxjTqwYip0ym4eVneGv3nSJUccV4B smn/S0PJBGUVegV+ZigG9LHJ7TWwLzuxVxKztfedegTKNrmPzqlF1ss7NvjllgaDDT6hIKITCzf ZyZFA2NJE38R70VpQO0ocMN2KZPXpx1Xzmd/mVOLvV0utLHz/QXTCV7gMep7vt1nUUjwYvw2oI7 IRtPCWLcgELB0o2UPwYBI0wvDIyIPXI7emg64Fqiqsz6qLGmkcirR3gyogNwFMV/vhCPROegYez 51G28rftpVr6hM9+DFeHi95JE040OV1oIDnX1Wzs4W8pYralCrxSDfH1R8nNuuObbAc4v5wymFd DJLPvIqA7DFGRLH0ERLrw+TTPzpME/5Oag6mP2BS1ZpFslNetNEM3rPnQg88Rfh4zzGUy5Fk4Ru 9p33dvGssvyrnrIIg== X-Received: by 2002:a05:7022:1286:b0:11a:e426:911a with SMTP id a92af1059eb24-1365f821ccbmr3116310c88.15.1779529446232; Sat, 23 May 2026 02:44:06 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:05 -0700 (PDT)
From: Anisa Su
X-Google-Original-From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron, Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny, Alison Schofield, John Groves, Gregory Price, Anisa Su, Ira Weiny
Subject: [PATCH v10 15/31] cxl/mem: Drop misaligned DCD extent groups
Date: Sat, 23 May 2026 02:43:09 -0700
Message-ID: <60e23199f7ef7dd3008bb3275c40d242334275c9.1779528761.git.anisa.su@samsung.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Add an alignment gate to cxl_add_pending(): every extent in a tag group
must have its start_dpa and length aligned to CXL_DCD_EXTENT_ALIGN
(SZ_2M, the minimum device-dax mapping granularity on every architecture
that enables CXL DCD). A misaligned extent makes the resulting dax
device unusable, so drop the whole group rather than accept a partial
allocation that would surface a broken dax_resource.

Based on patches by John Groves.
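
For illustration only (not part of the patch): a minimal userspace
sketch of the all-or-nothing alignment gate described above.
EXTENT_ALIGN, struct extent, extent_aligned() and group_aligned() are
hypothetical stand-ins for CXL_DCD_EXTENT_ALIGN, struct cxl_extent and
the kernel helpers, and the fields are assumed to be host-endian
already, so no le64_to_cpu() conversion appears here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's SZ_2M (2 MiB). */
#define EXTENT_ALIGN ((uint64_t)2 << 20)

/* Hypothetical, simplified extent: start and length in bytes. */
struct extent {
	uint64_t start_dpa;
	uint64_t length;
};

/* Power-of-two alignment test, the same idea as the kernel's IS_ALIGNED(). */
static bool is_aligned(uint64_t value, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value & (align - 1)) == 0;
}

/* Both the start and the length must sit on an EXTENT_ALIGN boundary. */
bool extent_aligned(const struct extent *e)
{
	return is_aligned(e->start_dpa, EXTENT_ALIGN) &&
	       is_aligned(e->length, EXTENT_ALIGN);
}

/* A group passes only if every member is aligned (all-or-nothing). */
bool group_aligned(const struct extent *group, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!extent_aligned(&group[i]))
			return false;
	return true;
}
```

A caller would drop every extent of a group for which group_aligned()
returns false, instead of keeping the aligned subset.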
Signed-off-by: Ira Weiny
Signed-off-by: John Groves
Signed-off-by: Anisa Su
---
Changes:
[anisa: split out as a separate validation step]
---
 drivers/cxl/core/mbox.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index e5edc3975e8f..421bd716a273 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1280,6 +1281,24 @@ static int add_to_pending_list(struct list_head *pending_list,
 	return 0;
 }
 
+/*
+ * Device-dax requires extent boundaries aligned to its mapping granularity.
+ * Use SZ_2M as a conservative default; a tighter check that queries the
+ * cxl_dax_region / cxl_endpoint_decoder for its actual alignment would be
+ * strictly more correct, but SZ_2M is the minimum device-dax supports on
+ * every architecture that enables CXL DCD today.
+ */
+#define CXL_DCD_EXTENT_ALIGN SZ_2M
+
+static bool cxl_extent_dcd_aligned(const struct cxl_extent *extent)
+{
+	u64 start = le64_to_cpu(extent->start_dpa);
+	u64 len = le64_to_cpu(extent->length);
+
+	return IS_ALIGNED(start, CXL_DCD_EXTENT_ALIGN) &&
+	       IS_ALIGNED(len, CXL_DCD_EXTENT_ALIGN);
+}
+
 /*
  * Compare two extents by shared_extn_seq (ascending). list_sort is
  * stable so when shared_extn_seq is 0 for every entry (non-sharable
@@ -1352,6 +1371,26 @@ static int cxl_add_pending(struct cxl_memdev_state *mds)
 		extract_tag_group(pending, &tag, &group);
 		list_sort(NULL, &group, extent_seq_compare);
 
+		/* Alignment gate: abort the group if any member fails */
+		bool aligned = true;
+		list_for_each_entry(pos, &group, list) {
+			if (!cxl_extent_dcd_aligned(pos->extent)) {
+				dev_warn(dev,
+					 "Tag %pUb: dropping group, extent DPA:%#llx LEN:%#llx not %u-aligned\n",
+					 &tag,
+					 le64_to_cpu(pos->extent->start_dpa),
+					 le64_to_cpu(pos->extent->length),
+					 CXL_DCD_EXTENT_ALIGN);
+				aligned = false;
+				break;
+			}
+		}
+		if (!aligned) {
+			list_for_each_entry_safe(pos, tmp, &group, list)
+				delete_extent_node(pos);
+			continue;
+		}
+
 		u16 logical_seq = 1;
 		list_for_each_entry_safe(pos, tmp, &group, list) {
 			u16 raw = le16_to_cpu(pos->extent->shared_extn_seq);
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
Received: from mail-dl1-f50.google.com (mail-dl1-f50.google.com [74.125.82.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B0353876BB for ; Sat, 23 May 2026 09:44:08 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.50
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529451; cv=none; b=T+m9oBdTrGHM6DH8rnJes15yL+W5oKuUo+4fbMW1gteb9RgIJnZKCUVTrFjzjAtZm+Cu6yNk+SV2LRuZ/7W9gIzA4aAhsaUjXHqTScJodSy/4K1CcnB9Ba4FkDHtGJ/R+bm3svn5MXEtR5O3dgXUydvHs0ufyJ6cNLxWBpel8hA=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529451; c=relaxed/simple; bh=+Fs7s5G0qsALoEg5+n2VxXmV7ZgzgjcZhlIbWf49dqI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type;
b=i5YR+l3RtSgVBdaCLvmfgAkTBjd+yo0zFXeCZRzMv3Yry1iJnYvU30Xiv0HUuYgAHI1PnSRd/ifxBYuu+dhQGv3RJdT+AK3EBtlJ3yWhtd8bxUhwtF00vw5I1QZTJcTSpJxyjSbGsmPqUUSwhMMzMAZ5fCn0BjIEIR8CwuZHydM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=MCTztp/C; arc=none smtp.client-ip=74.125.82.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MCTztp/C" Received: by mail-dl1-f50.google.com with SMTP id a92af1059eb24-135e88b8e55so3425210c88.0 for ; Sat, 23 May 2026 02:44:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529448; x=1780134248; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=lmIYAuyAsD2n/pMyj9L4/1IUWr8KglMpbMYEoiJlFTY=; b=MCTztp/C+i8seJ55Nim5KVKg+NeUCNld+NDuTn7TPmVoW7efCtgotIqc1KaSbAUGeh wsjG6PrMkxVYwmm5uYbrO+5cN2PXZ/JF3td0F9tXnIMLIbGCwwqR2rTAPRmMlhIjlEVl /TaWA5SLdL9t2rAzvdpOroSn7la9jbA4MBw9JGu0rDdupzjsdP5G2C5t3uh3iTkNRFIi IQHJAylgW2rDp9aLU49J5P0DL4ozvgfKSZFISfkDKXejuANAX2Mw93o1xAiyzB2tZsZg 7CuJ8EQKM78zl3/D82wsxS7DphQaFLBmTFZN1tELidWLz+k6BaFWk8k8ZOowWcjEDofI 4g4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529448; x=1780134248; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=lmIYAuyAsD2n/pMyj9L4/1IUWr8KglMpbMYEoiJlFTY=; b=sjm6cc54Eol+OmrX0Ddlx+dEspnwOSb3Ogu3+ugOh3jXnPrWDNFclSBkRCxEKRGHFR 
uV/tX04JrIC2BZYv9KvsRiPNSrEScWmzNIvK6HLK+FbVNbqIxReF77lBchXORX8XnpRl sTdtbDQ4CyvIidVAE5ne1hCWr+hZIu0CS1vhZi9+ZHPRuvUumwuEt09YPjF2R2Kzl3yT XzE1C3V8FCs7WkewHtY/umefDeH8CQo7PKBhr/MUE/s50oTsGiz52xieuh+mppC1C4lV cZXbkDzT9QIS/eqske57B9sSi37KpEKWHVQyA7jtiqta0BuAmmpbDKBnS9e3fM1sCDZM XQYA== X-Forwarded-Encrypted: i=1; AFNElJ+3N+uih3InfXuxIdRyC8FACnX2zOks6xuUyMGHc0PbfFMljlWoxEnvmOptEboJJ1bZMYTYm7YezHUQz+k=@vger.kernel.org X-Gm-Message-State: AOJu0YxVZxiBsWFM1aSEOsv1OKh2n2uuEVk4Q/ZB3z0v6aEEoXKLbF8H pGwUvtCdWIVV8SpG9Fo786kkXeCqJupEWjkilXDI1R2aC3VSYx57Gwu3 X-Gm-Gg: Acq92OF54U7zZLRlIwuLVAm/s7V51HQfNhb1VdXDnplqjLG7vsPXY5ED0IPC1krcK7G ceaoAIWhl4kljuVCWOXzgFVX+bVYCfbXccYiWZu7n2I+oLXLJtVgzOdy7DRHkcCUKNlQdTn3ezY 31WNFNoizepSviTPlGQnK4M8P0s+hJoxBlV5JYcySi8+CMLDUNEIZ4UFJhLth2Yleb75yM0V0Ip xhPlMBlST5AHPEOqhIe1nmhp+D8215uQBmT1s4B6GrCyPyXOjDlYbGf5DK5EVE0sHDJHBOcmkiL +V+A5PFja+l3urke3yVdEd2EG6U/4VsxTWLc0It7QCGY4tX6fdx66tTS62sXG+iJooB4kxQb/oR ucMJq9mUNTzNvsbu2wyeFydVRcSznGr8bgOfti9CP6Z3cdwCDCXTM8DlhKWjyh9Srk9cfiACrLK YU58mizxCTDeRpUSAie1HQvX1Gljy7y5cWgBbAtuhuArV+L1x2cFgHNw4daX/IqZgicWg4vvo/C vb2Ayo= X-Received: by 2002:a05:7022:128a:b0:12d:de3e:86a8 with SMTP id a92af1059eb24-1365fc774a3mr2758746c88.38.1779529448111; Sat, 23 May 2026 02:44:08 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:07 -0700 (PDT)
From: Anisa Su
X-Google-Original-From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron, Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny, Alison Schofield, John Groves, Gregory Price, Anisa Su, Ira Weiny
Subject: [PATCH v10 16/31] cxl/extent: Validate DC extent partition
Date: Sat, 23 May 2026 02:43:10 -0700
Message-ID:
X-Mailer: git-send-email 2.43.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Extend cxl_validate_extent(), the per-extent check of the add pipeline,
to check partition membership. Resolve the extent's DPA to its
containing DC partition, then apply rules that depend on whether that
partition is sharable:

 - Sharable: tag must be non-null and shared_extn_seq must be non-zero,
   because multiple hosts reading the same allocation rely on the
   device-stamped 1..n sequence to assemble extents in an agreed order.
 - Non-sharable: shared_extn_seq must be zero, because sequencing is
   meaningless when only one host consumes the allocation; the tag is
   optional (null UUID permitted).

Any cross-mix is a device firmware bug; reject the extent.

Based on patches by John Groves.
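
For illustration only (not part of the patch): a minimal userspace
sketch of the sharability rules listed above. enum tag_verdict,
uuid_is_zero() and check_tag_rules() are hypothetical names; the kernel
code returns -ENXIO for every rejection, while this sketch returns a
distinct verdict per rule so each case is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum tag_verdict {
	TAG_OK,
	TAG_ERR_NULL_TAG,	/* sharable partition, null UUID */
	TAG_ERR_SEQ_MISSING,	/* sharable partition, seq == 0 */
	TAG_ERR_SEQ_UNEXPECTED,	/* non-sharable partition, seq != 0 */
};

/* True when all 16 UUID bytes are zero (the "null tag"). */
static bool uuid_is_zero(const uint8_t uuid[16])
{
	static const uint8_t zero[16];

	return memcmp(uuid, zero, sizeof(zero)) == 0;
}

/* Apply the per-extent rules for a sharable or non-sharable partition. */
enum tag_verdict check_tag_rules(bool sharable, const uint8_t uuid[16],
				 uint16_t seq)
{
	assert(uuid != NULL);

	if (sharable) {
		if (uuid_is_zero(uuid))
			return TAG_ERR_NULL_TAG;
		if (seq == 0)
			return TAG_ERR_SEQ_MISSING;
		return TAG_OK;
	}

	/* Non-sharable: the tag is optional, but a sequence number is not allowed. */
	if (seq != 0)
		return TAG_ERR_SEQ_UNEXPECTED;
	return TAG_OK;
}
```

The null-tag check runs before the sequence check, matching the order
of the checks in the patch.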
Signed-off-by: Ira Weiny
Signed-off-by: John Groves
Signed-off-by: Anisa Su
---
Changes:
[anisa: split out as a separate validation step]
---
 drivers/cxl/core/core.h   |  4 ++
 drivers/cxl/core/extent.c | 78 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index 1bae80dbf991..30b6b05b155b 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -179,6 +179,10 @@ int cxl_pci_get_bandwidth(struct pci_dev *pdev, struct access_coordinate *c);
 int cxl_port_get_switch_dport_bandwidth(struct cxl_port *port,
 					struct access_coordinate *c);
 void memdev_release_extent(struct cxl_memdev_state *mds, struct range *range);
+const struct cxl_dpa_partition *
+cxl_extent_dc_partition(struct cxl_memdev_state *mds,
+			struct cxl_extent *extent,
+			struct range *ext_range);
 
 static inline struct device *port_to_host(struct cxl_port *port)
 {
diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
index 94128d06f4ed..b01507022cff 100644
--- a/drivers/cxl/core/extent.c
+++ b/drivers/cxl/core/extent.c
@@ -63,11 +63,55 @@ alloc_tag_group(struct cxl_dax_region *cxlr_dax, uuid_t *uuid)
 	return no_free_ptr(group);
 }
 
+/*
+ * Find the DC (Dynamic Capacity) partition that fully contains @ext_range,
+ * or NULL if the extent falls outside every DC partition on this memdev.
+ * The returned pointer is owned by mds->cxlds.part[] and lives for the
+ * lifetime of the memdev.
+ */
+const struct cxl_dpa_partition *
+cxl_extent_dc_partition(struct cxl_memdev_state *mds,
+			struct cxl_extent *extent,
+			struct range *ext_range)
+{
+	struct cxl_dev_state *cxlds = &mds->cxlds;
+	struct device *dev = mds->cxlds.dev;
+
+	for (int i = 0; i < cxlds->nr_partitions; i++) {
+		struct cxl_dpa_partition *part = &cxlds->part[i];
+		struct range partition_range = {
+			.start = part->res.start,
+			.end = part->res.end,
+		};
+
+		if (part->mode != CXL_PARTMODE_DYNAMIC_RAM_A)
+			continue;
+
+		if (range_contains(&partition_range, ext_range)) {
+			dev_dbg(dev, "DC extent DPA %pra (DCR:%pra)(%pU)\n",
+				ext_range, &partition_range, extent->uuid);
+			return part;
+		}
+	}
+
+	dev_err_ratelimited(dev,
+			    "DC extent DPA %pra (%pU) is not in a valid DC partition\n",
+			    ext_range, extent->uuid);
+	return NULL;
+}
+
 /*
  * Stage 1 of the add pipeline: pure, no allocation. Resolve the extent
- * to its region/endpoint decoder and ext_range, and verify the range
- * fits in the resolved endpoint decoder's DPA resource. Further
- * per-extent invariants layer into this function in subsequent commits.
+ * to its region/endpoint decoder and ext_range, and enforce every
+ * per-extent invariant the device must satisfy:
+ *
+ *   - DPA falls inside a Dynamic Capacity partition (cxl_extent_dc_partition).
+ *   - CDAT-sharability rules:
+ *       sharable:     tag must be non-null AND shared_extn_seq != 0
+ *       non-sharable: shared_extn_seq must be 0 (tag is optional)
+ *     Any cross-mixing is a device firmware bug.
+ *   - DPA resolves to an endpoint decoder attached to a region.
+ *   - The extent's range is fully contained in that ED's DPA resource.
  *
  * Caller must hold cxl_rwsem.region for read (cxl_dpa_to_region()).
  * On success, @out_cxled / @out_cxlr_dax / @out_ext_range carry the
@@ -81,6 +125,10 @@ static int cxl_validate_extent(struct cxl_memdev_state *mds,
 {
 	u64 start_dpa = le64_to_cpu(extent->start_dpa);
 	struct cxl_memdev *cxlmd = mds->cxlds.cxlmd;
+	struct device *dev = mds->cxlds.dev;
+	uuid_t *uuid = (uuid_t *)extent->uuid;
+	u16 seq = le16_to_cpu(extent->shared_extn_seq);
+	const struct cxl_dpa_partition *part;
 	struct cxl_endpoint_decoder *cxled;
 	struct cxl_region *cxlr;
 	struct range ext_range = (struct range) {
@@ -89,6 +137,30 @@ static int cxl_validate_extent(struct cxl_memdev_state *mds,
 	};
 	struct range ed_range;
 
+	part = cxl_extent_dc_partition(mds, extent, &ext_range);
+	if (!part)
+		return -ENXIO;
+
+	if (part->perf.shareable) {
+		if (uuid_is_null(uuid)) {
+			dev_err_ratelimited(dev,
+					    "DC extent DPA %pra: sharable-partition extent has null tag (firmware bug)\n",
+					    &ext_range);
+			return -ENXIO;
+		}
+		if (seq == 0) {
+			dev_err_ratelimited(dev,
+					    "DC extent DPA %pra (%pU): sharable-partition extent missing shared_extn_seq (firmware bug)\n",
+					    &ext_range, uuid);
+			return -ENXIO;
+		}
+	} else if (seq != 0) {
+		dev_err_ratelimited(dev,
+				    "DC extent DPA %pra (%pU): non-sharable partition but shared_extn_seq=%u (firmware bug)\n",
+				    &ext_range, uuid, seq);
+		return -ENXIO;
+	}
+
 	cxlr = cxl_dpa_to_region(cxlmd, start_dpa, &cxled);
 	if (!cxlr)
 		return -ENXIO;
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
Received: from mail-dl1-f42.google.com (mail-dl1-f42.google.com [74.125.82.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7971838758E for ; Sat, 23 May 2026 09:44:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.42
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529453; cv=none;
b=fXAg0aE9RQo6kxKB5ke2WVFavuHL123F6gvl3NQgsKkvAVnBMyS2wp3/utraTT/LoFMa8eE8Ku+SWUK9ha5szfdXxAxsJ02XblhIeRTqYZeGwySn92aeq/BGk5GT5FWgEsifIFkISIt++ynISDCcn+pgq/OifCPZiYudfj7gk8g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529453; c=relaxed/simple; bh=gKHsMOo3U8ogPTf8EO+rxX/WXeg527xCY1DR9mB6g58=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=sK9s1cDsVXQPACbjUp/PJ6uAO5aS0Jm15IkuXU4mTGd7sUe4UVF3t3wZ28Z8wpnxYzQ28d/E48DSoL19WvDYK4t7olIOJXQHWgvXUl/siZrPTRTOSql0wVXboZX/CTPxG9ZoBd0AJzJ6pxffCRlY6JQUW4VqOVtcR5jev1dgDgY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=O/x0CS9t; arc=none smtp.client-ip=74.125.82.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="O/x0CS9t" Received: by mail-dl1-f42.google.com with SMTP id a92af1059eb24-134ac81c445so10218784c88.1 for ; Sat, 23 May 2026 02:44:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529451; x=1780134251; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MqVfs6tjDDURbhRxkrKwQ4ZptDDu5a3li/NIrzqLAV8=; b=O/x0CS9tut8yP3huxrvkuEELZW8iy5YrOHrFXHoeGUEV5my7vRwHNlUMfXoy0aM+PL ZU5h+BiSWZ5/RUFEwkamlmpj51R/tSoVSnzb+RjHZ7Naz8UERz5JRHNs8UYgNioe7JKa RCfDNNuGC3ujcTx7Y9P/DhSb8b3H+Ee+AoXSf1SNOLDGOztJz6BwXPiSmu5d8WkLJ4wI 7OI/aGmfQnvNVa0iN62sxu0VBCm7lExtC3ZBzmjaUxjPkw3uBpnBj/ePtsThA8Fm7qkX 4pdxhxaroSZMbCCA0obVIc5T2Sb9fobkxmmNTShEBjrlyScp2/F9CBH50YixRC1SPy6R 18Xg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529451; x=1780134251; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=MqVfs6tjDDURbhRxkrKwQ4ZptDDu5a3li/NIrzqLAV8=; b=iP1irQWimK+VPh4bVhKhMO43gRIUYwvIpMQ1WCRJappQBBMJsJBOilAkbBq0rNTT+z n0gtKYofKA5ywuFUIphpqqzm5dEW3rc0XLXBbsekEh7ZBUUvHsoaA7XhvUpOu83zC0Uk rMtX9c97GB37VWXPkGJsq2QWhR8wowGmPdtmaF7lUTwVh391E4YyZn/WeX2kfBpgFc4v kJ31OE66j7K5qk6HSFNdwY65xuACmlCYQ06MxQwClyL2j+AnSFWz2IRwydlPQX1Df1Pp i2kc/ZMRCKXsmpCA/JN2XzCJtBqPzZDlRGO8R8mMmXYzx8gOkZ987g+2rlyIOyeyBAaZ Ygfw== X-Forwarded-Encrypted: i=1; AFNElJ97OSGzRo2H4jniq27NnVWnbw2ltjGX2cla6YLb3x1vuPfJNhy9xLxWYEQGBMADfVcdzLpVbb76Ij3rHGw=@vger.kernel.org X-Gm-Message-State: AOJu0YwHacWz/+YPq0zDEKrg3XLhj5fvx/yHuU10XzRzJ4Duy8juGQrq 8Bk/jBDZNvB7fO/dq8+J67dOTigLLgdcmWWX5GYF6/OOprUpu28aJc5A X-Gm-Gg: Acq92OE4W638V3hVX6A9wscmc90J0hwNpToTI1fslcoXozvmiLbq2AXccwD7NUSajsL fuEqAaQfhR0gaD74I6PUxxHeyvBoyNTrCWZoHnTkVgOV6Ou5Kn2hZizHajy8AYtUvASVfudARiU YymA8siudzgoMYa9j7ZV1uB6+4Pzrpnf75rJA7hDOhm4nfUJw6zRwRSjNjDiSJ7pPVfKMDLDPZe KXxqXU9MOUvd9fII5ydyWySRv6PsXsJiEnxZI07mBeNk4UL3bW9eTWjHiIhWE42gUe1mLTHfgPY qWzFF/QN9D2D5cYqWNHvbN7ZDf6eOEGLThGxqgkTVSxkFeB7hhv0e3NEte3u6Tw4hbxUeTi6HYi pzrsYvgE9+2VYeE4hxdpWK0UZ36YK580NMYFwgjdHE3wOlfFiDmq1kppAnS+veYZfTCrnV+YwTL sBkS/qXUf+AzNcEBayXWUw8CxPQYJGxu1/T8H/MHOWTUPpKLmGz2SX9Q29yxOcnNoSVXghqb8bk Rgw52g= X-Received: by 2002:a05:7022:ff48:b0:12a:6fb7:87e3 with SMTP id a92af1059eb24-1365fc6d7demr3181465c88.31.1779529450522; Sat, 23 May 2026 02:44:10 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:10 -0700 (PDT)
From: Anisa Su
X-Google-Original-From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Anisa Su, Ira Weiny
Subject: [PATCH v10 17/31] cxl/mem: Enforce tag-group semantics
Date: Sat, 23 May 2026 02:43:11 -0700
Message-ID: <9e1f5b0b36fd1607691c649bd39abecf2e60f8e6.1779528761.git.anisa.su@samsung.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

The previous commit fully fleshed out validation for individual extents.
This commit completes tag-group validation.

Add two group-level gates to cxl_add_pending() that
cxl_validate_extent()'s per-extent view can't see:

 - Sequence integrity (cxl_check_group_seq): a group is well-formed iff
   every member has shared_extn_seq =3D=3D 0 (non-shareable) or the
   sorted group is exactly 1..n contiguous (shareable).

 - Partition equality (cxl_check_group_partition): tagged allocations
   cannot span DC partitions; a partition's CDAT DSMAS entry is the unit
   at which shareable / writable / coherency attributes are described.
   Skipped for the null UUID.

Each check drops the whole group on violation. Cross-chain uniqueness of
a tag lands in a subsequent commit alongside the host-wide tag registry.

Based on patches by John Groves.

Signed-off-by: Ira Weiny
Signed-off-by: John Groves
Signed-off-by: Anisa Su
---
Changes:
[anisa: split out as a separate validation step.
Cross-chain uniqueness moved to the dedicated "Enforce cross-region tag uniqueness" commit so this one only adds =E2=80=94 no add-then-replace.] --- drivers/cxl/core/mbox.c | 117 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 421bd716a273..545c48c9c373 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -1342,6 +1342,109 @@ static void extract_tag_group(struct list_head *pen= ding, } } =20 +/* + * Validate shared_extn_seq across a tag group already sorted ascending. + * + * A tag group is well-formed iff either every member has + * shared_extn_seq =3D=3D 0 (non-sharable allocation) or the sorted group = is + * exactly 1, 2, ..., n (sharable). Anything else =E2=80=94 mix, gap, dup= licate, + * non-zero starting other than 1 =E2=80=94 is a device firmware bug. + */ +static int cxl_check_group_seq(struct device *dev, + const uuid_t *tag, + const struct list_head *group) +{ + struct cxl_extent_list_node *pos; + u16 first, expected; + + if (list_empty(group)) + return 0; + + pos =3D list_first_entry(group, struct cxl_extent_list_node, list); + first =3D le16_to_cpu(pos->extent->shared_extn_seq); + + if (first =3D=3D 0) { + list_for_each_entry(pos, group, list) { + if (le16_to_cpu(pos->extent->shared_extn_seq) !=3D 0) { + dev_warn(dev, + "Tag %pUb: shared_extn_seq mixed 0/non-zero in one allocation (firmw= are bug)\n", + tag); + return -EINVAL; + } + } + return 0; + } + + if (first !=3D 1) { + dev_warn(dev, + "Tag %pUb: shared_extn_seq starts at %u, expected 1 (firmware bug)\n", + tag, first); + return -EINVAL; + } + + expected =3D 1; + list_for_each_entry(pos, group, list) { + u16 s =3D le16_to_cpu(pos->extent->shared_extn_seq); + + if (s !=3D expected) { + dev_warn(dev, + "Tag %pUb: shared_extn_seq gap/dup: expected %u got %u (firmware bug)= \n", + tag, expected, s); + return -EINVAL; + } + expected++; + } + return 0; +} + +/* + * For tagged groups, reject 
allocations that span DC partitions. A tag + * is an allocation identity; the partition's CDAT DSMAS entry is what + * tells the host which attributes (sharable, writable, coherency) + * apply. Untagged groups are skipped =E2=80=94 the spec does not define a + * cross-chain identity for them. + */ +static int cxl_check_group_partition(struct cxl_memdev_state *mds, + const uuid_t *tag, + const struct list_head *group) +{ + struct device *dev =3D mds->cxlds.dev; + const struct cxl_dpa_partition *first_part =3D NULL; + u64 first_dpa =3D 0; + struct cxl_extent_list_node *pos; + + if (uuid_is_null(tag) || list_empty(group)) + return 0; + + list_for_each_entry(pos, group, list) { + struct cxl_extent *extent =3D pos->extent; + struct range ext_range =3D (struct range) { + .start =3D le64_to_cpu(extent->start_dpa), + .end =3D le64_to_cpu(extent->start_dpa) + + le64_to_cpu(extent->length) - 1, + }; + const struct cxl_dpa_partition *part; + + part =3D cxl_extent_dc_partition(mds, extent, &ext_range); + if (!part) + return -ENXIO; + + if (!first_part) { + first_part =3D part; + first_dpa =3D ext_range.start; + continue; + } + + if (part !=3D first_part) { + dev_warn(dev, + "Tag %pUb: extents span DC partitions (DPA:%#llx and DPA:%#llx), firm= ware bug\n", + tag, first_dpa, ext_range.start); + return -EINVAL; + } + } + return 0; +} + /* * Drive the pending Add-Capacity records through cxl_add_extent(), * grouped by tag. 
Per group: extract from pending, stable-sort by @@ -1371,6 +1474,20 @@ static int cxl_add_pending(struct cxl_memdev_state *= mds) extract_tag_group(pending, &tag, &group); list_sort(NULL, &group, extent_seq_compare); =20 + /* Sequence-number integrity */ + if (cxl_check_group_seq(dev, &tag, &group)) { + list_for_each_entry_safe(pos, tmp, &group, list) + delete_extent_node(pos); + continue; + } + + /* Partition equality (skipped for null UUID) */ + if (cxl_check_group_partition(mds, &tag, &group)) { + list_for_each_entry_safe(pos, tmp, &group, list) + delete_extent_node(pos); + continue; + } + /* Alignment gate =E2=80=94 abort the group if any member fails */ bool aligned =3D true; list_for_each_entry(pos, &group, list) { --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f48.google.com (mail-dl1-f48.google.com [74.125.82.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB9AA3876C1 for ; Sat, 23 May 2026 09:44:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529455; cv=none; b=KLqMMbpG6/0+8AYsojIZ9AP62n06jcexN7Iap9Y2BFJLt8dngK3sow0JkCNx07MSZoRbeinHxDMZsiJoq9VWCh/ZY86ESgDSog8CBwGsgAvks7mD+xYNPmVAQscE2Yrd6TdTEAHMYR5wL6lOX1tPuWRQ+QJcOle+AlxNt+4z7Hg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529455; c=relaxed/simple; bh=GS+yMp5c/jTLNQ20LGU0OkNzkvMy6q0vA0h1FBrJHOI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=LCxah+lnAoJkUeKn6Kf4jFKGgpnUoBbUyv+1gHB/eLSiEnS4hK+i1VKt/CzDuNlP/z3ayzhB+wShkrYMsb7Eq0e14C5IGvsaxoA7eVsZp03H8UYDoZhM1SN97X4T3LFI8ApUYdV3eC9a9ScH+VRl8Jv1hbRo4Z2iOGe+r2Kn1Ek= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; 
dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Tui+/81p; arc=none smtp.client-ip=74.125.82.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Tui+/81p" Received: by mail-dl1-f48.google.com with SMTP id a92af1059eb24-132830d8281so2577493c88.1 for ; Sat, 23 May 2026 02:44:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529453; x=1780134253; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hiEYQbicDj1JkgORlqjRwmZPKqd/B9x9VxQhFj4OIII=; b=Tui+/81piEfbmyhvxR8//o2JxLDOiBevv5PlL3KIhLKRLBHGWjFcXYYuyiMMQ82mQI pQTR5DsunHgv5UrLuMPLTiVkrYGp0XAO7LcejlNA9az8KRcLDdCYbIikbFj3lx1HQeB/ bZEKCusho87l8WYj3TVINiXsLLM40KM6sAyqGPrfwoOaQH5RGiG8qvbYWG8u0KwQKF6b kQnvmzbmR80Sx8MKqVmYSZ7lyPYAfdRIQEJb1esg41zJBuQVy11TIA2RJdxDYpDGtduC SeP6q2yKls4wP5cwJtJk2fKbZ5upeEGqwfaSpQuBjOM7UD2+63FgDawVEI7OuMaoGevH d87Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529453; x=1780134253; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=hiEYQbicDj1JkgORlqjRwmZPKqd/B9x9VxQhFj4OIII=; b=L0p6c620TO+EnBqZWWO/80czYfaEIw/FImFG1fY9YqO7BWvGouVfZutkscQvksk3z8 IhNWSQPZD/A4MKtzgC21v4ke2WP8zoTkxRAPR86qHtz4nG7RURoz1hOCoGsjK5iU7nKk QvrLbA8hFChMCTNgQneysCly8AvRLuNHRmzwVDZ3GV9pa1zzfiqFOhAHigrnikweluSv uHHwrB7iAlgOXH3X3sW25ORcZoOdAd4ZHAMwKAKwsK5yIsVOGKMv5fuSEBQWCHEkFdp0 YpTHl5maWRH3XK00ksYTvgkn/vNwN4l+Xqgf8jQG7jdNsCN3Okz2AmrKwNs/wr0WnBcg xbeA== X-Forwarded-Encrypted: i=1; 
AFNElJ8R5e/ovGt7S19A+sazJoZwozOt7GPnLscj7gL3mJAUXO1/ahES77Ws0L6dKFCu7OfeX72rZn/uJnQ6abs=@vger.kernel.org X-Gm-Message-State: AOJu0YzTRvjp23aJHbV7CnmRtK24mNzY0jwBJTvxZKHppkhDX2q74krR H33ZvMgV3sVlbw0PpV8GH41BW3p3S6a1vrg4T8pqGadbi9Okk+ZZUzPO X-Gm-Gg: Acq92OGnsc/GAIpfgcJ374ShPiq22Mt0k6USsH3UIDImJCVmgjsGKxi/A+D2VkBvmIf 4OAPOKmNeXm2/1zflMcA0frXkGSbmI3dKFpzsE+UUcNidUaE56ZR9UhGxdXe01i1HUqvZavcnNB EuwZ3S2a96KOW45fThqbQ4K55uqF5DnaaYlMV9rBD3k/PavfL4PkT2/2OQKB9pWWWmwGlZCFBaK JCI+8RYB1jGYkYlf5F4uYm6yBA20/V1KIwIFPR81bv+gCFuXDdrLHWjhNq4lY1ihIhHHalllHib zjWG2NVL+LXshb1gcYQCnKLyJvnU083H9y0cQGiQBKHuxxlQatczVNOhrcIxLPbPt1L1nWmNcg/ jXBaos66leXMPnuaHFTK1uHPLwRuqFmJT4ttgGojP9DRIW+UjfAW9v9KAZT04PeaszVerPg+pEH jXcut5pzJEut8ii+jbd2jLNf9GaWt71rUBNbgKy5KHuCnt7zohZJBJrWU7nYlnrh5bi71lYuEG5 Wqf9DRsETx7Ykp9Yg== X-Received: by 2002:a05:7022:1605:b0:132:1e01:8737 with SMTP id a92af1059eb24-1365fb403b8mr2621962c88.26.1779529452731; Sat, 23 May 2026 02:44:12 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:12 -0700 (PDT)
From: Anisa Su
X-Google-Original-From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Anisa Su, Ira Weiny
Subject: [PATCH v10 18/31] cxl/extent: Handle DC Release Capacity events
Date: Sat, 23 May 2026 02:43:12 -0700
Message-ID:
X-Mailer: git-send-email 2.43.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Replace the no-op ack stub for cxl_rm_extent() with the real teardown:
resolve the released DPA range to its region and endpoint decoder,
locate the matching dc_extent in cxlr_dax->dc_extents (filtering by
cxled, range containment, and tag), and tear down the entire containing
tag group atomically through rm_tag_group(). Partial release is not
supported.

rm_tag_group() invalidates caches once for the whole group (no mappings
exist at this point =E2=80=94 partial release is not supported, so all
members are leaving together), then walks the group's dc_extents and
releases each via its devm action installed at online_tag_group() time.

cxl_region_invalidate_memregion() becomes non-static and is declared in
core.h so rm_tag_group() can flush caches before tearing the group down.

When the released range maps to no region (host crashed before
persisting acceptance, region destruction raced device release, or the
device is confused) the host has nothing to drop, so reply via
memdev_release_extent() to keep the device's view consistent.

Based on an original patch by Navneet Singh.
Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: restructured from the original "Process dynamic partition events" monolith; this commit replaces the stubbed release with the real walk-and-tear-down of the matching tag group.] --- drivers/cxl/core/core.h | 8 +++ drivers/cxl/core/extent.c | 101 ++++++++++++++++++++++++++++++++++++++ drivers/cxl/core/mbox.c | 19 ------- drivers/cxl/core/region.c | 2 +- 4 files changed, 110 insertions(+), 20 deletions(-) diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 30b6b05b155b..65daaaadf68e 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -28,6 +28,8 @@ cxled_to_mds(struct cxl_endpoint_decoder *cxled) return container_of(cxlds, struct cxl_memdev_state, cxlds); } =20 +int cxl_region_invalidate_memregion(struct cxl_region *cxlr); + #ifdef CONFIG_CXL_REGION =20 struct cxl_region_context { @@ -67,6 +69,7 @@ int devm_cxl_add_pmem_region(struct cxl_region *cxlr); =20 int cxl_add_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent, u16 seq_num); +int cxl_rm_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent); int online_tag_group(struct cxl_dc_tag_group *group); #else static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, @@ -79,6 +82,11 @@ static inline int cxl_add_extent(struct cxl_memdev_state= *mds, { return 0; } +static inline int cxl_rm_extent(struct cxl_memdev_state *mds, + struct cxl_extent *extent) +{ + return 0; +} static inline int online_tag_group(struct cxl_dc_tag_group *group) { return 0; diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c index b01507022cff..51116c8139ed 100644 --- a/drivers/cxl/core/extent.c +++ b/drivers/cxl/core/extent.c @@ -344,6 +344,107 @@ static void dc_extent_unregister(void *ext) device_unregister(&dc_extent->dev); } =20 +static void rm_tag_group(struct cxl_dc_tag_group *group) +{ + struct device *region_dev =3D &group->cxlr_dax->dev; + struct dc_extent *dc_extent; + unsigned long index; + + /* 
+ * Tagged allocations release atomically. Invalidate caches once + * for the whole group (no mappings exist at this point =E2=80=94 partial + * release is not supported, so all members are leaving use + * together) before tearing down each dc_extent device. + * + * Pin @group across the walk: each devm_release_action runs the + * dc_extent_unregister action synchronously, which drops the last + * reference on the dc_extent device and fires dc_extent_release. + * The release decrements group->nr_extents and, on the final + * decrement, frees @group. Without the pin the next iteration's + * xa_find_after() dereferences a freed xarray. + */ + cxl_region_invalidate_memregion(group->cxlr_dax->cxlr); + + group->nr_extents++; + xa_for_each(&group->dc_extents, index, dc_extent) + devm_release_action(region_dev, dc_extent_unregister, dc_extent); + group->nr_extents--; + if (!group->nr_extents) + free_tag_group(group); +} + +int cxl_rm_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent) +{ + u64 start_dpa =3D le64_to_cpu(extent->start_dpa); + struct cxl_memdev *cxlmd =3D mds->cxlds.cxlmd; + struct cxl_endpoint_decoder *cxled; + struct cxl_dax_region *cxlr_dax; + struct cxl_dc_tag_group *group; + struct dc_extent *dc_extent; + struct cxl_region *cxlr; + struct range dpa_range; + unsigned long idx; + uuid_t tag; + + dpa_range =3D (struct range) { + .start =3D start_dpa, + .end =3D start_dpa + le64_to_cpu(extent->length) - 1, + }; + + guard(rwsem_read)(&cxl_rwsem.region); + cxlr =3D cxl_dpa_to_region(cxlmd, start_dpa, &cxled); + if (!cxlr) { + /* + * No region can happen here for a few reasons: + * + * 1) Extents were accepted and the host crashed/rebooted + * leaving them in an accepted state. On reboot the host + * has not yet created a region to own them. + * + * 2) Region destruction won the race with the device releasing + * all the extents. Here the release will be a duplicate of + * the one sent via region destruction. 
+ * + * 3) The device is confused and releasing extents for which no + * region ever existed. + * + * In all these cases make sure the device knows we are not + * using this extent. + */ + memdev_release_extent(mds, &dpa_range); + return -ENXIO; + } + + cxlr_dax =3D cxlr->cxlr_dax; + import_uuid(&tag, extent->uuid); + + /* + * Find the dc_extent whose DPA range covers the released range and + * whose tag matches. The release targets the entire containing + * tag group atomically; partial release is not supported. + */ + group =3D NULL; + xa_for_each(&cxlr_dax->dc_extents, idx, dc_extent) { + if (dc_extent->cxled !=3D cxled) + continue; + if (!range_contains(&dc_extent->dpa_range, &dpa_range)) + continue; + if (!uuid_equal(&dc_extent->group->uuid, &tag)) + continue; + group =3D dc_extent->group; + break; + } + if (!group) { + dev_err(&cxlr_dax->dev, + "release DPA %pra (%pU) matches no dc_extent\n", + &dpa_range, &tag); + return -EINVAL; + } + + rm_tag_group(group); + return 0; +} + static void cleanup_pending_dc_extent(struct dc_extent *dc_extent) { struct cxl_dc_tag_group *group =3D dc_extent->group; diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 545c48c9c373..70e6c4c9743c 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -1587,25 +1587,6 @@ static int handle_add_event(struct cxl_memdev_state = *mds, return rc; } =20 -/* - * Stub: ack the release back to the device so it knows we are not - * using the range. A later commit replaces this with the real - * teardown that walks the region's tag group and tears down the - * member dc_extent devices. 
- */ -static int cxl_rm_extent(struct cxl_memdev_state *mds, - struct cxl_extent *extent) -{ - u64 start_dpa =3D le64_to_cpu(extent->start_dpa); - struct range dpa_range =3D { - .start =3D start_dpa, - .end =3D start_dpa + le64_to_cpu(extent->length) - 1, - }; - - memdev_release_extent(mds, &dpa_range); - return 0; -} - static char *cxl_dcd_evt_type_str(u8 type) { switch (type) { diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 733d77c07493..317630d8bf2e 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -222,7 +222,7 @@ static struct cxl_region_ref *cxl_rr_load(struct cxl_po= rt *port, return xa_load(&port->regions, (unsigned long)cxlr); } =20 -static int cxl_region_invalidate_memregion(struct cxl_region *cxlr) +int cxl_region_invalidate_memregion(struct cxl_region *cxlr) { if (!cpu_cache_has_invalidate_memregion()) { if (IS_ENABLED(CONFIG_CXL_REGION_INVALIDATION_TEST)) { --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f52.google.com (mail-dl1-f52.google.com [74.125.82.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67D5A38B7B6 for ; Sat, 23 May 2026 09:44:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529457; cv=none; b=Wx8s0D8OJaa/F2KOddVGVinqDx8Jk1VSDTP/Cyla/G/TZ6E/PG0us3DDkIa2dFfnME0i+bOywTthf2kxFRJDCFOwacc9pA7kmi8AlLi7Jw3rZmVFk8wzbg1nzCHIroXvh9QKzyJNPUQK5ts46RxZBkjbgV2x2kAra1ZDeeE8Csk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529457; c=relaxed/simple; bh=slI4nvrX63CgPuf0Zdsw9jQIJ2+RZL8IaZz0qSbJ7Zs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=uPj5vZFYESJIsH7pf63wJe6BYkmlYhfV1kkOjuJjEj8v+EN93sQ87H4d8aXAgJeA8o4LAuCQ3uX1vJURd5bFtlppyiXfFUMCW+hwhGTj+GTs5KFZ3P1xqhv+3e2R4XKDmupRYrjqeEriGDQub55nQc8wLq0Aw9SiuyDGMDUdYxI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=i3vcKAsQ; arc=none smtp.client-ip=74.125.82.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="i3vcKAsQ" Received: by mail-dl1-f52.google.com with SMTP id a92af1059eb24-1329fc4bf77so2963981c88.1 for ; Sat, 23 May 2026 02:44:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529454; x=1780134254; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QdCEB4+GSe0WqjUBIw1Z2Tl7ajOiH3CDmRtVej+qJq8=; b=i3vcKAsQaR9s6keLi0+j758II1uuS6gttPuLiuAk6EL55BHZrA6hS1DX7Nd5xui/+w bwOJwLEk33nwp5Eqr1D9O4pOp+3LBV+di+3iS0o9lcK7I9TnWSx7P6KPMdnkCtKpRYqy RWteZepPVg/0BdIU4B6lgQc6kwEzHrw8S/CTRTQXhigF6yBJTlEAsp2FeBcNLgp6zIRf e920GWtnIaEeAmEmkDcA7Odht9D/2ISr6/lMhKPSCIHcqZz7mECeubUxal3MDfYiuxBx B3cHSGwJeYzSruV7ZtGDjZMWcZuRaW9FdQoy0bcv9aBTmspbrG6av/8aXruR5ZLpYxxV RDvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529454; x=1780134254; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=QdCEB4+GSe0WqjUBIw1Z2Tl7ajOiH3CDmRtVej+qJq8=; b=bVWOqP4UVx1f36gflnz1W+9lUU4zQoxj9kgxSNCejnqtnFxLlOV7SsxAJeZ6LcNaDd 
vhy7uBoy2iBzx6tOnVhW+0gcRYn5UJCzcCbdJpA9SgqHxbK20XtUrOxJZ7EPDfewVKBc 5dkTTKhKuVYvzhrWws98B4gALuRJRjAK7sD4kT4qv9FSAtMEvN1xoIZwaJa1hbd0R0Uq +xJbMfn7QYX7PxnUS4EgDLaV1DNB0dWoYd2A5SRKjz1Gwz/KUpl7Z8fGgyYo60grnThx PaepLBHmd9g5jCidtmKIBSv7ocIqQf/xnydPTZoXnXJvGm568aVltp1wldLY4aepd4DQ NpvA== X-Forwarded-Encrypted: i=1; AFNElJ9RhfPps09cTfGklGESXeSu5tuPAnRTcU5jpYspZYp5SdkoEaLJH7RmdBI0Hkb1xbrxlBh0XMTToVuViP4=@vger.kernel.org X-Gm-Message-State: AOJu0Yy+cTIchgdE7mPKW9I+l57HlffaFepROPrz+cTDvTdczkAN5y2B B2OJcqaBqAhZ7xB3XZR8Xr34YXb9yPPNzLw1IAGtHdDnCyrSF7CvXsa3 X-Gm-Gg: Acq92OHgNIV43vBnL3CmN74iJVPl6XL6w4Vj+aqygs+u+0TSaJtGxSw69zcfxqfwLlK p97nn3dqpH8OkXakSKXL91vk2f4k+A5btO6GXFjw84c7DhWY+gYMig6zVz0cjOT2sOU6IskNfX4 pOJg/rlZ319Bxc6GWPrsRsAuDoHAHotf16IZVp7sbmp/SYnkAvNIH4pEt2yS0Qup86SQSGiHS7P DcG9cif26fTHizk4RXwA1cGZDiGxNNt7qB596zg6ifNfaQyAgfvH96CP6bwMT5jPmhrVBaebhLY 7If8//BCeOvCWYRkyXBCs2353k2wfNKMTRQnd21LJLH/GJpGr6MKbN4QJNhhe7jIsPD+U1kqQrL joKL1zFSHMGnznm2kO9Lb+WNRXqM5WhLjCdn0HJyOfyzmY0tW3LO6hzJ10UJWnk7cvibCcX4hFk aODP8SytUSYIwmx7dhN/RvUqsXu5/KGgoflLT64qlHSRX7nVnozkHwmLET7dDatMfyep9nMAzzj RQatCu7471Z3s83LagR8sDpJ8I4 X-Received: by 2002:a05:7022:60d:b0:130:ab68:2b6f with SMTP id a92af1059eb24-1365f812d70mr2615308c88.9.1779529454246; Sat, 23 May 2026 02:44:14 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:13 -0700 (PDT)
From: Anisa Su
X-Google-Original-From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Anisa Su
Subject: [PATCH v10 19/31] cxl/extent: Enforce cross-region tag uniqueness
Date: Sat, 23 May 2026 02:43:13 -0700
Message-ID: <8f4aa2f5da26221efdd85650578c953657466e0f.1779528761.git.anisa.su@samsung.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

The per-region scan in cxl_tag_already_committed() only catches a tag
re-appearing on the same cxlr_dax. The orchestrator owns tag allocation
and is responsible for global uniqueness, but a buggy FM (or firmware
redelivering a tag for a previously-closed allocation) can still hand
the same uuid to extents on two different regions or memdevs, and the
per-region check accepts the second one =E2=80=94 leaving two
independent cxl_dc_tag_group objects with the same uuid.

Add a host-wide registry of live tag groups with non-null uuids.
alloc_tag_group() inserts on success, free_tag_group() removes; both
skip the null-uuid case since the spec defines no cross-chain identity
for untagged allocations. An attempt to add a second group with the
same uuid fails with -EBUSY.

No exit hook is needed: cxl_core only unloads after every dependent
module has, by which point every live tag group has been freed and the
registry is empty.
Signed-off-by: Anisa Su --- drivers/cxl/core/core.h | 5 ++++ drivers/cxl/core/extent.c | 60 +++++++++++++++++++++++++++++++++++++++ drivers/cxl/core/mbox.c | 19 +++++++++++++ drivers/cxl/cxl.h | 3 ++ 4 files changed, 87 insertions(+) diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 65daaaadf68e..02b36728c22d 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -69,6 +69,7 @@ int devm_cxl_add_pmem_region(struct cxl_region *cxlr); =20 int cxl_add_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent, u16 seq_num); +bool cxl_tag_already_committed(const uuid_t *tag); int cxl_rm_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent); int online_tag_group(struct cxl_dc_tag_group *group); #else @@ -91,6 +92,10 @@ static inline int online_tag_group(struct cxl_dc_tag_gro= up *group) { return 0; } +static inline bool cxl_tag_already_committed(const uuid_t *tag) +{ + return false; +} static inline struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa, struct cxl_endpoint_decoder **cxled) diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c index 51116c8139ed..f66fa8c600c5 100644 --- a/drivers/cxl/core/extent.c +++ b/drivers/cxl/core/extent.c @@ -18,8 +18,60 @@ static void cxled_release_extent(struct cxl_endpoint_dec= oder *cxled, memdev_release_extent(mds, &dc_extent->dpa_range); } =20 +/* + * Host-wide registry of live tag groups with non-null uuids. Enforces + * that within this host, a tag uuid identifies exactly one allocation + * across all regions and memdevs =E2=80=94 closing the gap left by the + * per-region scans in cxlr_add_extent() and uuid_claim_tagged(). The + * orchestrator (FM) owns tag-uuid allocation per spec; this is a + * defense against firmware bugs and orchestrator misbehavior. Untagged + * (null uuid) allocations are not tracked: the spec defines no + * cross-chain identity for them. 
+ */ +static DEFINE_MUTEX(cxl_tag_lock); +static LIST_HEAD(cxl_tag_groups); + +static int cxl_tag_register(struct cxl_dc_tag_group *grp) +{ + struct cxl_dc_tag_group *g; + + if (uuid_is_null(&grp->uuid)) + return 0; + + guard(mutex)(&cxl_tag_lock); + list_for_each_entry(g, &cxl_tag_groups, registry_node) + if (uuid_equal(&g->uuid, &grp->uuid)) + return -EBUSY; + list_add_tail(&grp->registry_node, &cxl_tag_groups); + return 0; +} + +static void cxl_tag_unregister(struct cxl_dc_tag_group *grp) +{ + if (uuid_is_null(&grp->uuid)) + return; + + guard(mutex)(&cxl_tag_lock); + list_del(&grp->registry_node); +} + +bool cxl_tag_already_committed(const uuid_t *tag) +{ + struct cxl_dc_tag_group *g; + + if (uuid_is_null(tag)) + return false; + + guard(mutex)(&cxl_tag_lock); + list_for_each_entry(g, &cxl_tag_groups, registry_node) + if (uuid_equal(&g->uuid, tag)) + return true; + return false; +} + static void free_tag_group(struct cxl_dc_tag_group *group) { + cxl_tag_unregister(group); xa_destroy(&group->dc_extents); kfree(group); } @@ -54,12 +106,20 @@ alloc_tag_group(struct cxl_dax_region *cxlr_dax, uuid_= t *uuid) { struct cxl_dc_tag_group *group __free(kfree) =3D kzalloc(sizeof(*group), GFP_KERNEL); + int rc; + if (!group) return ERR_PTR(-ENOMEM); =20 group->cxlr_dax =3D cxlr_dax; uuid_copy(&group->uuid, uuid); xa_init(&group->dc_extents); + INIT_LIST_HEAD(&group->registry_node); + + rc =3D cxl_tag_register(group); + if (rc) + return ERR_PTR(rc); + return no_free_ptr(group); } =20 diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 70e6c4c9743c..85959dee35ea 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -1474,6 +1474,25 @@ static int cxl_add_pending(struct cxl_memdev_state *= mds) extract_tag_group(pending, &tag, &group); list_sort(NULL, &group, extent_seq_compare); =20 + /* + * Cross-More-chain uniqueness. A non-null tag seen in this + * group must not already correspond to a committed tag group + * anywhere on this host. 
More=3D0 was supposed to close that + * allocation, and tag uuids must be unique across all regions + * and memdevs (the orchestrator owns assignment per spec). + * Either constraint failing =E2=80=94 same chain redelivered, or two + * distinct allocations colliding on the same uuid =E2=80=94 is a + * firmware/orchestrator bug; reject the whole group. + */ + if (cxl_tag_already_committed(&tag)) { + dev_warn(dev, + "Tag %pUb: dropping group, tag already committed (firmware/orchestrat= or bug)\n", + &tag); + list_for_each_entry_safe(pos, tmp, &group, list) + delete_extent_node(pos); + continue; + } + /* Sequence-number integrity */ if (cxl_check_group_seq(dev, &tag, &group)) { list_for_each_entry_safe(pos, tmp, &group, list) diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index cbbfba92fea9..a28e7b12a4a8 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -598,12 +598,15 @@ struct cxl_dax_region { * allocations. * @nr_extents: live count of dc_extents in the group; the group is freed * when the last dc_extent device is released. + * @registry_node: anchor in the host-wide non-null-tag registry that + * enforces tag uuid uniqueness across all regions and memdevs. 
*/ struct cxl_dc_tag_group { struct cxl_dax_region *cxlr_dax; uuid_t uuid; struct xarray dc_extents; unsigned int nr_extents; + struct list_head registry_node; }; =20 bool is_dc_extent(struct device *dev); --=20 2.43.0
From nobody Sun May 24 20:33:06 2026 From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron, Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny, Alison Schofield, John Groves, Gregory Price, Ira Weiny, Jonathan Cameron, Fan Ni Subject: [PATCH v10 20/31] cxl/region/extent: Expose dc_extent information in sysfs Date: Sat, 23 May 2026 02:43:14 -0700 Message-ID: <52f5a9ba175424c0f0a181e32ed6c04f26993d96.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable
From: Ira Weiny Extent information can be helpful to the user to coordinate memory usage with the external orchestrator and FM. Expose the details of each dc_extent by creating the following sysfs entries.
/sys/bus/cxl/devices/dax_regionX/extentX.Y /sys/bus/cxl/devices/dax_regionX/extentX.Y/offset /sys/bus/cxl/devices/dax_regionX/extentX.Y/length /sys/bus/cxl/devices/dax_regionX/extentX.Y/uuid Each dc_extent surfaces as its own extentX.Y device under the parent dax_region. offset and length describe that dc_extent's HPA range, not an aggregate bounding box across the containing tagged allocation =E2=80=94 so when a tagged allocation has multiple DPA-discontiguous extents, each is reported with its own offset and length. uuid is the tag identifying the containing allocation; it is shared across dc_extents that belong to the same tagged allocation and is hidden for untagged extents. Based on an original patch by Navneet Singh. Reviewed-by: Jonathan Cameron Reviewed-by: Fan Ni Tested-by: Fan Ni Signed-off-by: Ira Weiny --- Documentation/ABI/testing/sysfs-bus-cxl | 36 +++++++++++++++ drivers/cxl/core/extent.c | 58 +++++++++++++++++++++++++ 2 files changed, 94 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/te= sting/sysfs-bus-cxl index 3080aef9ad67..38cf0a2894b9 100644 --- a/Documentation/ABI/testing/sysfs-bus-cxl +++ b/Documentation/ABI/testing/sysfs-bus-cxl @@ -661,3 +661,39 @@ Description: The count is persistent across power loss and wraps back to 0 upon overflow. If this file is not present, the device does not have the necessary support for dirty tracking. + + +What: /sys/bus/cxl/devices/dax_regionX/extentX.Y/offset +Date: May, 2025 +KernelVersion: v6.16 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) [For Dynamic Capacity regions only] Users can use the + extent information to create DAX devices on specific extents. + This is done by creating and destroying DAX devices in specific + sequences and looking at the mappings created. Extent offset + within the region. 
+ + +What: /sys/bus/cxl/devices/dax_regionX/extentX.Y/length +Date: May, 2025 +KernelVersion: v6.16 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) [For Dynamic Capacity regions only] Users can use the + extent information to create DAX devices on specific extents. + This is done by creating and destroying DAX devices in specific + sequences and looking at the mappings created. Extent length + within the region. + + +What: /sys/bus/cxl/devices/dax_regionX/extentX.Y/uuid +Date: May, 2025 +KernelVersion: v6.16 +Contact: linux-cxl@vger.kernel.org +Description: + (RO) [For Dynamic Capacity regions only] Users can use the + extent information to create DAX devices on specific extents. + This is done by creating and destroying DAX devices in specific + sequences and looking at the mappings created. UUID of this + extent. diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c index f66fa8c600c5..34babfe032d1 100644 --- a/drivers/cxl/core/extent.c +++ b/drivers/cxl/core/extent.c @@ -6,6 +6,63 @@ =20 #include "core.h" =20 +static ssize_t offset_show(struct device *dev, struct device_attribute *at= tr, + char *buf) +{ + struct dc_extent *dc_extent =3D to_dc_extent(dev); + + return sysfs_emit(buf, "%#llx\n", dc_extent->hpa_range.start); +} +static DEVICE_ATTR_RO(offset); + +static ssize_t length_show(struct device *dev, struct device_attribute *at= tr, + char *buf) +{ + struct dc_extent *dc_extent =3D to_dc_extent(dev); + u64 length =3D range_len(&dc_extent->hpa_range); + + return sysfs_emit(buf, "%#llx\n", length); +} +static DEVICE_ATTR_RO(length); + +static ssize_t uuid_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct dc_extent *dc_extent =3D to_dc_extent(dev); + + return sysfs_emit(buf, "%pUb\n", &dc_extent->group->uuid); +} +static DEVICE_ATTR_RO(uuid); + +static struct attribute *dc_extent_attrs[] =3D { + &dev_attr_offset.attr, + &dev_attr_length.attr, + &dev_attr_uuid.attr, + NULL +}; + +static uuid_t empty_uuid =3D 
{ 0 }; + +static umode_t dc_extent_visible(struct kobject *kobj, + struct attribute *a, int n) +{ + struct device *dev =3D kobj_to_dev(kobj); + struct dc_extent *dc_extent =3D to_dc_extent(dev); + + if (a =3D=3D &dev_attr_uuid.attr && + uuid_equal(&dc_extent->group->uuid, &empty_uuid)) + return 0; + + return a->mode; +} + +static const struct attribute_group dc_extent_attribute_group =3D { + .attrs =3D dc_extent_attrs, + .is_visible =3D dc_extent_visible, +}; + +__ATTRIBUTE_GROUPS(dc_extent_attribute); + =20 static void cxled_release_extent(struct cxl_endpoint_decoder *cxled, struct dc_extent *dc_extent) @@ -93,6 +150,7 @@ static void dc_extent_release(struct device *dev) static const struct device_type dc_extent_type =3D { .name =3D "extent", .release =3D dc_extent_release, + .groups =3D dc_extent_attribute_groups, }; =20 bool is_dc_extent(struct device *dev) --=20 2.43.0
From nobody Sun May 24 20:33:06 2026 From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron, Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny, Alison Schofield, John Groves, Gregory Price, Anisa Su, Ira Weiny Subject: [PATCH v10 21/31] cxl + dax: Surface dax_resources on DCD Add Capacity events Date: Sat, 23 May 2026 02:43:15 -0700 Message-ID: <9195bbfbed68420432eba01ebd5d50b2e284ccc0.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable
When an extent is accepted/released, the CXL driver must notify the DAX driver to coordinate the management of resources. Define the .notify callback to the cxl_dax region driver to enable the coordination. Define struct dax_resource, a sub-resource of a DC dax_region representing the capacity of one dc_extent. When the cxl side onlines a tag group during a DC Add event, notify the DAX region to register a struct dax_resource for each extent under the dax_region's resource tree. The dax_resource model: * struct dax_resource (dax-private.h) =E2=80=94 per-extent sub-resource of a DC dax_region: pointer back to its region, the kernel struct resource, the tag uuid, the per-allocation seq_num, and a use_cnt that lets a later commit refuse release of an in-use extent. * struct dev_dax_range gains a dax_resource back-pointer so a carved range remembers which extent it lives in. For now, dax_resources live under the dax_region and remain inaccessible to DAX devices.
A later commit adds support for specifying a tag when creating a DAX device, which then allows dax_resources to be claimed by tag. Release is handled in the following commit. Based on an original patch by Navneet Singh. Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: restructured from the original "Create resources on sparse DAX regions" commit] --- drivers/cxl/core/core.h | 10 ++++ drivers/cxl/core/extent.c | 33 ++++++++++- drivers/cxl/core/mbox.c | 17 +++++- drivers/cxl/cxl.h | 6 ++ drivers/dax/bus.c | 118 ++++++++++++++++++++++++++++++++++---- drivers/dax/bus.h | 3 +- drivers/dax/cxl.c | 88 +++++++++++++++++++++++++++- drivers/dax/dax-private.h | 49 ++++++++++++++++ drivers/dax/hmem/hmem.c | 2 +- drivers/dax/pmem.c | 2 +- 10 files changed, 306 insertions(+), 22 deletions(-) diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 02b36728c22d..c28e357c5817 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -72,6 +72,9 @@ int cxl_add_extent(struct cxl_memdev_state *mds, struct c= xl_extent *extent, bool cxl_tag_already_committed(const uuid_t *tag); int cxl_rm_extent(struct cxl_memdev_state *mds, struct cxl_extent *extent); int online_tag_group(struct cxl_dc_tag_group *group); +void rm_tag_group(struct cxl_dc_tag_group *group); +int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event, + struct cxl_dc_tag_group *group); #else static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, u64 dpa) @@ -96,6 +99,13 @@ static inline bool cxl_tag_already_committed(const uuid_= t *tag) { return false; } +static inline void rm_tag_group(struct cxl_dc_tag_group *group) { } +static inline int cxlr_notify_extent(struct cxl_region *cxlr, + enum dc_event event, + struct cxl_dc_tag_group *group) +{ + return 0; +} static inline struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 d= pa, struct cxl_endpoint_decoder **cxled) diff --git
a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c index 34babfe032d1..3fc4b7292664 100644 --- a/drivers/cxl/core/extent.c +++ b/drivers/cxl/core/extent.c @@ -63,7 +63,6 @@ static const struct attribute_group dc_extent_attribute_g= roup =3D { =20 __ATTRIBUTE_GROUPS(dc_extent_attribute); =20 - static void cxled_release_extent(struct cxl_endpoint_decoder *cxled, struct dc_extent *dc_extent) { @@ -359,6 +358,36 @@ dc_extent_build(struct cxl_endpoint_decoder *cxled, return dc_extent; } =20 +int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event, + struct cxl_dc_tag_group *group) +{ + struct device *dev =3D &cxlr->cxlr_dax->dev; + struct cxl_notify_data notify_data; + struct cxl_driver *driver; + + dev_dbg(dev, "Trying notify: type %d tag %pUb\n", event, &group->uuid); + + guard(device)(dev); + + /* + * The lack of a driver indicates a notification has failed. No user + * space coordination was possible. + */ + if (!dev->driver) + return 0; + driver =3D to_cxl_drv(dev->driver); + if (!driver->notify) + return 0; + + notify_data =3D (struct cxl_notify_data) { + .event =3D event, + .group =3D group, + }; + + dev_dbg(dev, "Notify: type %d tag %pUb\n", event, &group->uuid); + return driver->notify(dev, &notify_data); +} + /* * Stage 4: insert @dc_extent into the pending tag group.
All extents in * one More-chain group share a UUID =E2=80=94 enforced here as the group = is @@ -462,7 +491,7 @@ static void dc_extent_unregister(void *ext) device_unregister(&dc_extent->dev); } =20 -static void rm_tag_group(struct cxl_dc_tag_group *group) +void rm_tag_group(struct cxl_dc_tag_group *group) { struct device *region_dev =3D &group->cxlr_dax->dev; struct dc_extent *dc_extent; diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 85959dee35ea..8071c1ed1b36 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -1558,8 +1558,21 @@ static int cxl_add_pending(struct cxl_memdev_state *= mds) list_for_each_entry_safe(pos, tmp, &group, list) delete_extent_node(pos); } else { - list_splice_tail_init(&group, &accepted); - total_accepted +=3D group_cnt; + rc =3D cxlr_notify_extent(tag_group->cxlr_dax->cxlr, + DCD_ADD_CAPACITY, tag_group); + if (rc) { + /* + * The dax-side notification failed; tear down the + * tag group and drop the extents so we do not + * mis-report acceptance to the device. + */ + rm_tag_group(tag_group); + list_for_each_entry_safe(pos, tmp, &group, list) + delete_extent_node(pos); + } else { + list_splice_tail_init(&group, &accepted); + total_accepted +=3D group_cnt; + } } =20 mds->add_ctx.group =3D NULL; diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index a28e7b12a4a8..27e3046654e9 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -892,6 +892,11 @@ bool is_cxl_region(struct device *dev); =20 extern const struct bus_type cxl_bus_type; =20 +struct cxl_notify_data { + enum dc_event event; + struct cxl_dc_tag_group *group; +}; + /* * Note, add_dport() is expressly for the cxl_port driver. 
TODO: investiga= te a * type-safe driver model where probe()/remove() take the type of object i= mplied @@ -904,6 +909,7 @@ struct cxl_driver { void (*remove)(struct device *dev); struct cxl_dport *(*add_dport)(struct cxl_port *port, struct device *dport_dev); + int (*notify)(struct device *dev, struct cxl_notify_data *notify_data); struct device_driver drv; int id; }; diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index b0c2162b5e37..a6ee59f2d8a1 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -186,6 +186,73 @@ static bool is_dynamic(struct dax_region *dax_region) return (dax_region->res.flags & IORESOURCE_DAX_DCD) !=3D 0; } =20 +static void __dax_release_resource(struct dax_resource *dax_resource) +{ + struct dax_region *dax_region =3D dax_resource->region; + + lockdep_assert_held_write(&dax_region_rwsem); + dev_dbg(dax_region->dev, "Extent release resource %pr\n", + dax_resource->res); + if (dax_resource->res) + __release_region(&dax_region->res, dax_resource->res->start, + resource_size(dax_resource->res)); + dax_resource->res =3D NULL; +} + +static void dax_release_resource(void *res) +{ + struct dax_resource *dax_resource =3D res; + + guard(rwsem_write)(&dax_region_rwsem); + __dax_release_resource(dax_resource); + kfree(dax_resource); +} + +int dax_region_add_resource(struct dax_region *dax_region, + struct device *device, + resource_size_t start, resource_size_t length, + const uuid_t *tag, u16 seq_num) +{ + struct resource *new_resource; + int rc; + + struct dax_resource *dax_resource __free(kfree) =3D + kzalloc(sizeof(*dax_resource), GFP_KERNEL); + if (!dax_resource) + return -ENOMEM; + + guard(rwsem_write)(&dax_region_rwsem); + + dev_dbg(dax_region->dev, "DAX region resource %pr\n", &dax_region->res); + new_resource =3D __request_region(&dax_region->res, start, length, "exten= t", 0); + if (!new_resource) { + dev_err(dax_region->dev, "Failed to add region s:%pa l:%pa\n", + &start, &length); + return -ENOSPC; + } + + dev_dbg(dax_region->dev, 
"add resource %pr\n", new_resource); + dax_resource->region =3D dax_region; + dax_resource->res =3D new_resource; + dax_resource->seq_num =3D seq_num; + if (tag) + uuid_copy(&dax_resource->uuid, tag); + + /* + * open code devm_add_action_or_reset() to avoid recursive write lock + * of dax_region_rwsem in the error case. + */ + rc =3D devm_add_action(device, dax_release_resource, dax_resource); + if (rc) { + __dax_release_resource(dax_resource); + return rc; + } + + dev_set_drvdata(device, no_free_ptr(dax_resource)); + return 0; +} +EXPORT_SYMBOL_GPL(dax_region_add_resource); + bool static_dev_dax(struct dev_dax *dev_dax) { return is_static(dev_dax->region); @@ -304,14 +371,25 @@ static struct device_attribute dev_attr_region_align = =3D =20 static unsigned long long dax_region_avail_size(struct dax_region *dax_reg= ion) { - resource_size_t size =3D resource_size(&dax_region->res); + resource_size_t size; struct resource *res; =20 lockdep_assert_held(&dax_region_rwsem); =20 - if (is_dynamic(dax_region)) - return 0; + if (is_dynamic(dax_region)) { + /* + * Children of a dynamic region are extents, claimed + * all-or-nothing: an extent's resource is either unclaimed (no + * child) or fully consumed by exactly one dax device. 
+ */ + size =3D 0; + for_each_dax_region_resource(dax_region, res) + if (!res->child) + size +=3D resource_size(res); + return size; + } =20 + size =3D resource_size(&dax_region->res); for_each_dax_region_resource(dax_region, res) size -=3D resource_size(res); return size; @@ -452,15 +530,26 @@ EXPORT_SYMBOL_GPL(kill_dev_dax); static void trim_dev_dax_range(struct dev_dax *dev_dax) { int i =3D dev_dax->nr_range - 1; - struct range *range =3D &dev_dax->ranges[i].range; + struct dev_dax_range *dev_range =3D &dev_dax->ranges[i]; + struct range *range =3D &dev_range->range; struct dax_region *dax_region =3D dev_dax->region; + struct resource *res =3D &dax_region->res; =20 lockdep_assert_held_write(&dax_region_rwsem); dev_dbg(&dev_dax->dev, "delete range[%d]: %#llx:%#llx\n", i, (unsigned long long)range->start, (unsigned long long)range->end); =20 - __release_region(&dax_region->res, range->start, range_len(range)); + if (dev_range->dax_resource) { + res =3D dev_range->dax_resource->res; + dev_dbg(&dev_dax->dev, "Trim dc extent %pr\n", res); + } + + __release_region(res, range->start, range_len(range)); + + if (dev_range->dax_resource) + dev_range->dax_resource->use_cnt--; + if (--dev_dax->nr_range =3D=3D 0) { kfree(dev_dax->ranges); dev_dax->ranges =3D NULL; @@ -644,11 +733,14 @@ static void dax_region_unregister(void *region) =20 struct dax_region *alloc_dax_region(struct device *parent, int region_id, struct range *range, int target_node, unsigned int align, - unsigned long flags) + unsigned long flags, struct dax_dc_ops *dc_ops) { struct dax_region *dax_region; int rc; =20 + if (!dc_ops && (flags & IORESOURCE_DAX_DCD)) + return NULL; + /* * The DAX core assumes that it can store its private data in * parent->driver_data. 
This WARN is a reminder / safeguard for @@ -673,6 +765,7 @@ struct dax_region *alloc_dax_region(struct device *pare= nt, int region_id, dax_region->align =3D align; dax_region->dev =3D parent; dax_region->target_node =3D target_node; + dax_region->dc_ops =3D dc_ops; ida_init(&dax_region->ida); dax_region->res =3D (struct resource) { .start =3D range->start, @@ -861,7 +954,7 @@ static int devm_register_dax_mapping(struct dev_dax *de= v_dax, int range_id) } =20 static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start, - resource_size_t size) + resource_size_t size, struct dax_resource *dax_resource) { struct dax_region *dax_region =3D dev_dax->region; struct resource *res =3D &dax_region->res; @@ -902,6 +995,7 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax,= u64 start, .start =3D alloc->start, .end =3D alloc->end, }, + .dax_resource =3D dax_resource, }; =20 dev_dbg(dev, "alloc range[%d]: %pa:%pa\n", dev_dax->nr_range - 1, @@ -1075,7 +1169,7 @@ static ssize_t dev_dax_resize(struct dax_region *dax_= region, retry: first =3D region_res->child; if (!first) - return alloc_dev_dax_range(dev_dax, dax_region->res.start, to_alloc); + return alloc_dev_dax_range(dev_dax, dax_region->res.start, to_alloc, NUL= L); =20 rc =3D -ENOSPC; for (res =3D first; res; res =3D res->sibling) { @@ -1084,7 +1178,7 @@ static ssize_t dev_dax_resize(struct dax_region *dax_= region, /* space at the beginning of the region */ if (res =3D=3D first && res->start > dax_region->res.start) { alloc =3D min(res->start - dax_region->res.start, to_alloc); - rc =3D alloc_dev_dax_range(dev_dax, dax_region->res.start, alloc); + rc =3D alloc_dev_dax_range(dev_dax, dax_region->res.start, alloc, NULL); break; } =20 @@ -1104,7 +1198,7 @@ static ssize_t dev_dax_resize(struct dax_region *dax_= region, rc =3D adjust_dev_dax_range(dev_dax, res, resource_size(res) + alloc); break; } - rc =3D alloc_dev_dax_range(dev_dax, res->end + 1, alloc); + rc =3D alloc_dev_dax_range(dev_dax, res->end + 1, alloc, 
NULL); break; } if (rc) @@ -1214,7 +1308,7 @@ static ssize_t mapping_store(struct device *dev, stru= ct device_attribute *attr, =20 to_alloc =3D range_len(&r); if (alloc_is_aligned(dev_dax, to_alloc)) - rc =3D alloc_dev_dax_range(dev_dax, r.start, to_alloc); + rc =3D alloc_dev_dax_range(dev_dax, r.start, to_alloc, NULL); up_write(&dax_dev_rwsem); up_write(&dax_region_rwsem); =20 @@ -1506,7 +1600,7 @@ static struct dev_dax *__devm_create_dev_dax(struct d= ev_dax_data *data) device_initialize(dev); dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id); =20 - rc =3D alloc_dev_dax_range(dev_dax, dax_region->res.start, data->size); + rc =3D alloc_dev_dax_range(dev_dax, dax_region->res.start, data->size, NU= LL); if (rc) goto err_range; =20 diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index 6e739bfab932..7a115893a102 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -11,6 +11,7 @@ struct dev_dax; struct resource; struct dax_device; struct dax_region; +struct dax_dc_ops; =20 /* dax bus specific ioresource flags */ #define IORESOURCE_DAX_STATIC BIT(0) @@ -19,7 +20,7 @@ struct dax_region; =20 struct dax_region *alloc_dax_region(struct device *parent, int region_id, struct range *range, int target_node, unsigned int align, - unsigned long flags); + unsigned long flags, struct dax_dc_ops *dc_ops); =20 struct dev_dax_data { struct dax_region *dax_region; diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index f58fe992aa8d..690cf625e052 100644 --- a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -5,6 +5,84 @@ =20 #include "../cxl/cxl.h" #include "bus.h" +#include "dax-private.h" + +static int __cxl_dax_add_resource(struct dax_region *dax_region, + struct dc_extent *dc_extent) +{ + struct device *dev =3D &dc_extent->dev; + resource_size_t start, length; + + start =3D dax_region->res.start + dc_extent->hpa_range.start; + length =3D range_len(&dc_extent->hpa_range); + return dax_region_add_resource(dax_region, dev, start, length, + &dc_extent->group->uuid, + 
dc_extent->seq_num); +} + +static int cxl_dax_add_resource(struct device *dev, void *data) +{ + struct dax_region *dax_region =3D data; + struct dc_extent *dc_extent; + + dc_extent =3D to_dc_extent(dev); + if (!dc_extent) + return 0; + + dev_dbg(dax_region->dev, "Adding resource HPA %pra (%pUb)\n", + &dc_extent->hpa_range, &dc_extent->group->uuid); + + return __cxl_dax_add_resource(dax_region, dc_extent); +} + +static int cxl_dax_group_add(struct dax_region *dax_region, + struct cxl_dc_tag_group *group) +{ + struct dc_extent *dc_extent; + unsigned long index; + int rc; + + xa_for_each(&group->dc_extents, index, dc_extent) { + rc =3D __cxl_dax_add_resource(dax_region, dc_extent); + if (rc) + return rc; + } + return 0; +} + +/* + * RELEASE is still a stub here =E2=80=94 the atomic dax_region_rm_resourc= es API + * and its wire-up land in the next commit. An incoming RELEASE returns + * success and the cxl side proceeds to rm_tag_group(), which device- + * unregisters each dc_extent; the devm action armed by + * dax_region_add_resource() then tears down each dax_resource. 
+ */ +static int cxl_dax_region_notify(struct device *dev, + struct cxl_notify_data *notify_data) +{ + struct cxl_dax_region *cxlr_dax =3D to_cxl_dax_region(dev); + struct dax_region *dax_region =3D dev_get_drvdata(dev); + struct cxl_dc_tag_group *group =3D notify_data->group; + + switch (notify_data->event) { + case DCD_ADD_CAPACITY: + return cxl_dax_group_add(dax_region, group); + case DCD_RELEASE_CAPACITY: + dev_dbg(&cxlr_dax->dev, + "DCD RELEASE notify (tag %pUb): no-op (stub)\n", + &group->uuid); + return 0; + case DCD_FORCED_CAPACITY_RELEASE: + default: + dev_err(&cxlr_dax->dev, "Unknown DC event %d\n", + notify_data->event); + return -ENXIO; + } +} + +static struct dax_dc_ops dc_ops =3D { + .is_extent =3D is_dc_extent, +}; =20 static int cxl_dax_region_probe(struct device *dev) { @@ -25,15 +103,18 @@ static int cxl_dax_region_probe(struct device *dev) flags =3D IORESOURCE_DAX_KMEM; =20 dax_region =3D alloc_dax_region(dev, cxlr->id, &cxlr_dax->hpa_range, nid, - PMD_SIZE, flags); + PMD_SIZE, flags, &dc_ops); if (!dax_region) return -ENOMEM; =20 - if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A) + if (cxlr->mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A) { + device_for_each_child(&cxlr_dax->dev, dax_region, + cxl_dax_add_resource); /* Add empty seed dax device */ dev_size =3D 0; - else + } else { dev_size =3D range_len(&cxlr_dax->hpa_range); + } =20 data =3D (struct dev_dax_data) { .dax_region =3D dax_region, @@ -48,6 +129,7 @@ static int cxl_dax_region_probe(struct device *dev) static struct cxl_driver cxl_dax_region_driver =3D { .name =3D "cxl_dax_region", .probe =3D cxl_dax_region_probe, + .notify =3D cxl_dax_region_notify, .id =3D CXL_DEVICE_DAX_REGION, .drv =3D { .suppress_bind_attrs =3D true, diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index ee8f3af8387f..f2ae5918f94d 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -8,6 +8,7 @@ #include #include #include +#include =20 /* private routines between core files */ 
struct dax_device; @@ -16,6 +17,14 @@ struct inode *dax_inode(struct dax_device *dax_dev); int dax_bus_init(void); void dax_bus_exit(void); =20 +/** + * struct dax_dc_ops - Operations for dc-backed regions + * @is_extent: return true if the device is an extent + */ +struct dax_dc_ops { + bool (*is_extent)(struct device *dev); +}; + /** * struct dax_region - mapping infrastructure for dax devices * @id: kernel-wide unique region for a memory range @@ -27,6 +36,7 @@ void dax_bus_exit(void); * @res: resource tree to track instance allocations * @seed: allow userspace to find the first unbound seed device * @youngest: allow userspace to find the most recently created device + * @dc_ops: operations required for DC-backed regions */ struct dax_region { int id; @@ -38,6 +48,7 @@ struct dax_region { struct resource res; struct device *seed; struct device *youngest; + struct dax_dc_ops *dc_ops; }; =20 /** @@ -57,11 +68,13 @@ struct dax_mapping { * @pgoff: page offset * @range: resource-span * @mapping: reference to the dax_mapping for this range + * @dax_resource: if not NULL, the dax DC resource containing this range */ struct dev_dax_range { unsigned long pgoff; struct range range; struct dax_mapping *mapping; + struct dax_resource *dax_resource; }; =20 /** @@ -105,6 +118,42 @@ struct dev_dax { */ void run_dax(struct dax_device *dax_dev); =20 +/** + * struct dax_resource - For DC DAX regions; an active resource + * @region: dax_region this resource is in + * @res: resource + * @uuid: tag identifying the backing extent; zero uuid means untagged + * @seq_num: 1..n assembly-order index within the tag group; 0 for the + * untagged pool (uuid =3D=3D 0). For extents from a sharable + * CXL DC partition this is the device-stamped shared_extn_seq + * (CXL 3.1 Table 8-51). For extents from a non-sharable + * partition the cxl layer fills it in event arrival order, so + * the dax layer can rely on a single 1..n dense invariant when + * it claims a tagged group in uuid_store(). 
+ * @use_cnt: count the number of uses of this resource + * + * Changes to the dax_region and the dax_resources within it are protected= by + * dax_region_rwsem + * + * dax_resource's are not intended to be used outside the dax layer. + */ +struct dax_resource { + struct dax_region *region; + struct resource *res; + uuid_t uuid; + u16 seq_num; + unsigned int use_cnt; +}; + +/* + * Similar to run_dax() dax_region_add_resource() is exported but is not + * intended to be a generic operation outside the dax subsystem. It is on= ly + * generic between the dax layer and the dax drivers. + */ +int dax_region_add_resource(struct dax_region *dax_region, struct device *= dev, + resource_size_t start, resource_size_t length, + const uuid_t *tag, u16 seq_num); + static inline struct dev_dax *to_dev_dax(struct device *dev) { return container_of(dev, struct dev_dax, dev); diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index af21f66bf872..be938c2a73f8 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -28,7 +28,7 @@ static int dax_hmem_probe(struct platform_device *pdev) =20 mri =3D dev->platform_data; dax_region =3D alloc_dax_region(dev, pdev->id, &mri->range, - mri->target_node, PMD_SIZE, flags); + mri->target_node, PMD_SIZE, flags, NULL); if (!dax_region) return -ENOMEM; =20 diff --git a/drivers/dax/pmem.c b/drivers/dax/pmem.c index bee93066a849..5b5be86768f3 100644 --- a/drivers/dax/pmem.c +++ b/drivers/dax/pmem.c @@ -53,7 +53,7 @@ static struct dev_dax *__dax_pmem_probe(struct device *de= v) range.start +=3D offset; dax_region =3D alloc_dax_region(dev, region_id, &range, nd_region->target_node, le32_to_cpu(pfn_sb->align), - IORESOURCE_DAX_STATIC); + IORESOURCE_DAX_STATIC, NULL); if (!dax_region) return ERR_PTR(-ENOMEM); =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f51.google.com (mail-dl1-f51.google.com [74.125.82.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client 
certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BBE7638D01E for ; Sat, 23 May 2026 09:44:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529461; cv=none; b=gBuzQ+pRfgvg1WY5DzhLw1dHx9OW8ymQe1DjuR5fz+Tnrt8BQ7JQknZ/cv4YBaOAisBqo2yN1Zj2vS1UcG2tl7AdQ/RorzM2fSDTLSNY1g5TH8i0tyIvruI5AIMNkfcF/22brne9P39silahzoS9+dqz2SagsI6UPz8nV2v/dLo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529461; c=relaxed/simple; bh=LUSayAW7Ss//wrzrfaEVE3L7lgIMMJWzEY1PnB+ZGhY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=E0rv60or0gFX1HlcYcezYjc3CZdBRe1TDubLZ0QOjF3WcRDGze1pGllhuxFMB6/45ANd0NH8XITilIyUlUIoM5AuSxg8loMGKnJNEfOOvYpYHeJeBfzyPrK4BjUBNxYOPYTMi3nipWpdjuIeZWyQzPvJYRRXr/hAuX9NQc9hNYY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=aWGNpbLQ; arc=none smtp.client-ip=74.125.82.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="aWGNpbLQ" Received: by mail-dl1-f51.google.com with SMTP id a92af1059eb24-135e88b8e55so3425411c88.0 for ; Sat, 23 May 2026 02:44:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529459; x=1780134259; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+Dxps5tq57tj4Ih7YoWNcRidGTGqRdOSyJhuFvRk8Es=; 
b=aWGNpbLQFASZzf4IgGyOl3Y0Yyca27Eq/6nfW2SBu6VkIN9y9TX/c/u+PTIALlXIqI 3yguBkg9CKYY76CkACPwAJsHy3W8+otB5sdkpfdfUUs8dV8ZiEALlQuoIwxXiEYaOerh JN4ZM3YA/8LMRzCJRw7YpNGO/TkQtZnCgOKNFID2cRRvL5Y4WNTx5GkgOBKTeewb5RmZ cq/xfp4VmWISEk3tXxnU+xAlbH9tfDnHehhcEEYUlF/0BVhNCkj+jn8xghCwWiLf5ZtF xQb5gyocajqY3cS2ycZx4iI2DaDm2AqFIN5RvzY+JjwzCj68l1ooHQRSd5XZGapqGgkY 5BQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529459; x=1780134259; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=+Dxps5tq57tj4Ih7YoWNcRidGTGqRdOSyJhuFvRk8Es=; b=HacOS8Xd4uh2dpUieQicaZuPlF4twFCKtZQCgHaysSSkztZMpB7FzFaiWN20QxzwGK c6aEFiqf8OleTi6jFHetSLXw1IK+L37AF0RHegd3DInBLf/pX1WyGxVZeiLBxd0zsTb8 FYs1r1ZPUGhFINs7OuF893sdI0sTDdJDxbE2c4dn0t4gc1LH5g4g+lvUfnXQ27tnZSQw CoPnXynT1IUSvF7yMtoHYhVpVXpGv3ma8vgHlG7xPKQEH38z8tq6vpYG4DavDXNnX7Bz cP7MsrzB9UKhuHymVAXHCeIbfQW4lLzcXCM49yRUnC2QeH8uLoli4Y0YjkfzSLmCpZhz 9HRA== X-Forwarded-Encrypted: i=1; AFNElJ/jKYzgg93txKEDk4UmolL768THrT+lib858VhOphQNPhCM/6JVpgI4YSoVcebFpgh+4sc4Tu1yEdrCv50=@vger.kernel.org X-Gm-Message-State: AOJu0Ywc29JjbPtd+gRfVnAP3iSe3iztVGExrZvDAFctOkzPlriQsluX l8fq/TBDWEHG2EE1/I039g588ApGB2rhemp1Nlvfu3jMn3FbUnOVQuSn X-Gm-Gg: Acq92OEdludJaZnVFxhOZijS4xDqRva8YBsFFotFCPWLzdeOHMCaCK6dFRetrLq0jwq eVvclJ3rmwO/+lv6FcYN051pEtaIoUTMzOozCETzI3ZOsZQ6c+qTE3rViCcZ8dFM5HIwggjUPNI pv8ZJRV8k8+F2ZbpNHL/RwLTWLqLzPLS2Umb5KtZoJVLnmlrM0e1wZMV3WjWcSKx7Y8Xs1WjAtC YzK/8gTpUv0YZIYxf8ngdr7OalPgbhQfYmx3fZss2UcU6I8hs/sCp7lzwxrkt13qqsy5Bm1Fx6/ SSwhz121fWURR/w0PyI6V1yjwORwMGsTPD/cI2uzlglCXAN8VaIUuBF3DHQcEtr1q/WrHM9CFh1 hKZrlz2hJgg7PgOzFRhrXAvS6Yz58zlGH9HJp71vyHEas5iumowF6apN2/BpRTISCBAZ523JH/3 n+oiXmKOftEeZtJ215ZHcKBzmQbHBlHriGuqPEfBjjJKzovW1kITR3NFs8xaatJTrIqGA9wvgg8 AlxgDo= X-Received: by 2002:a05:7022:206:b0:12d:b28e:75b1 with SMTP id a92af1059eb24-1365fb40523mr2838281c88.22.1779529458842; Sat, 23 May 2026 02:44:18 
-0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:18 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su , Ira Weiny Subject: [PATCH v10 22/31] cxl + dax: Release dax_resources on DCD Release Capacity events Date: Sat, 23 May 2026 02:43:16 -0700 Message-ID: X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Implement the release path that mirrors the add path: when the device asks for capacity back, the dax layer tears down the per-extent resources for the whole tag group atomically. If any extent in the group is still mapped by a dev_dax, the release is refused with -EBUSY and no state changes; the cxl side then leaves the tag group intact and the device retries. Also add a rollback to the add path: if any per-extent registration fails midway through a group, undo the ones already added so a partial group never leaks into the dax region. Based on an original patch by Navneet Singh. Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: split out from the original "Surface dc_extents" commit; fills in the RELEASE half of the bridge, moves the cxl-side RELEASE notify into this commit, and adds the rollback path to ADD.] 
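For readers following the release path, the refuse-all-or-none rule described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct toy_resource and toy_rm_resources() are hypothetical names, not kernel symbols, and the locking the real code takes (dax_region_rwsem held for write across both passes) is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dax_resource; not the kernel type. */
struct toy_resource {
	unsigned int use_cnt;	/* non-zero while a consumer still maps it */
	bool released;		/* set once the resource has been torn down */
};

/*
 * Two-pass, refuse-all-or-none removal over @n entries of @res.
 * NULL entries model devices with no resource attached and are skipped.
 * Returns 0 on success, or -EBUSY without changing any entry when at
 * least one entry is still in use.
 */
int toy_rm_resources(struct toy_resource * const *res, size_t n)
{
	size_t i;

	/* Pass 1: inspect only, so a refusal leaves every entry untouched. */
	for (i = 0; i < n; i++)
		if (res[i] && res[i]->use_cnt)
			return -EBUSY;

	/* Pass 2: nothing is busy, so release every attached entry. */
	for (i = 0; i < n; i++)
		if (res[i])
			res[i]->released = true;

	return 0;
}
```

A caller that sees -EBUSY from this sketch can rely on no entry having been released, which is the property the cxl side depends on when it leaves the tag group intact and waits for the device to retry.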
--- drivers/cxl/core/extent.c | 13 +++++++++ drivers/dax/bus.c | 59 +++++++++++++++++++++++++++++++++++++++ drivers/dax/cxl.c | 54 +++++++++++++++++++++++++++-------- drivers/dax/dax-private.h | 8 ++++-- 4 files changed, 120 insertions(+), 14 deletions(-) diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c index 3fc4b7292664..2c8edfe53c0a 100644 --- a/drivers/cxl/core/extent.c +++ b/drivers/cxl/core/extent.c @@ -532,6 +532,7 @@ int cxl_rm_extent(struct cxl_memdev_state *mds, struct = cxl_extent *extent) struct range dpa_range; unsigned long idx; uuid_t tag; + int rc; =20 dpa_range =3D (struct range) { .start =3D start_dpa, @@ -588,6 +589,18 @@ int cxl_rm_extent(struct cxl_memdev_state *mds, struct= cxl_extent *extent) return -EINVAL; } =20 + rc =3D cxlr_notify_extent(cxlr, DCD_RELEASE_CAPACITY, group); + if (rc) { + /* + * dax layer refused (-EBUSY) or failed (-ENOMEM, etc.). Do + * not proceed to tear down the tag group =E2=80=94 leave its + * dax_resources alive so we do not free them out from under + * live dev_dax ranges. The device will retry the release. 
+ */ + return 0; + } + + /* Release the entire tag group */ rm_tag_group(group); return 0; } diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index a6ee59f2d8a1..6368bdfdf93a 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -253,6 +253,65 @@ int dax_region_add_resource(struct dax_region *dax_reg= ion, } EXPORT_SYMBOL_GPL(dax_region_add_resource); =20 +int dax_region_rm_resource(struct dax_region *dax_region, + struct device *dev) +{ + struct dax_resource *dax_resource; + + guard(rwsem_write)(&dax_region_rwsem); + + dax_resource =3D dev_get_drvdata(dev); + if (!dax_resource) + return 0; + + if (dax_resource->use_cnt) + return -EBUSY; + + /* + * release the resource under dax_region_rwsem to avoid races with + * users trying to use the extent + */ + __dax_release_resource(dax_resource); + dev_set_drvdata(dev, NULL); + return 0; +} +EXPORT_SYMBOL_GPL(dax_region_rm_resource); + +/** + * dax_region_rm_resources - atomically remove a set of dax_resources. + * + * Walk @devs twice under dax_region_rwsem. First pass refuses the + * operation if any member's use_cnt is non-zero; second pass releases + * each. This gives refuse-all-or-none semantics across the set, which + * a tag group's atomic release relies on. Devices with no + * dax_resource attached are silently skipped. 
+ */ +int dax_region_rm_resources(struct dax_region *dax_region, + struct device * const *devs, unsigned int n) +{ + unsigned int i; + + guard(rwsem_write)(&dax_region_rwsem); + + for (i =3D 0; i < n; i++) { + struct dax_resource *r =3D dev_get_drvdata(devs[i]); + + if (r && r->use_cnt) + return -EBUSY; + } + + for (i =3D 0; i < n; i++) { + struct dax_resource *r =3D dev_get_drvdata(devs[i]); + + if (!r) + continue; + __dax_release_resource(r); + dev_set_drvdata(devs[i], NULL); + } + return 0; +} +EXPORT_SYMBOL_GPL(dax_region_rm_resources); + bool static_dev_dax(struct dev_dax *dev_dax) { return is_static(dev_dax->region); diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index 690cf625e052..04b73315a8f2 100644 --- a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -44,19 +44,52 @@ static int cxl_dax_group_add(struct dax_region *dax_reg= ion, =20 xa_for_each(&group->dc_extents, index, dc_extent) { rc =3D __cxl_dax_add_resource(dax_region, dc_extent); - if (rc) + if (rc) { + /* + * Unwind every dax_resource already added for this + * group; one rm per owner suffices. + */ + struct dc_extent *u; + unsigned long uidx; + + xa_for_each(&group->dc_extents, uidx, u) { + if (u =3D=3D dc_extent) + break; + dax_region_rm_resource(dax_region, &u->dev); + } return rc; + } } return 0; } =20 -/* - * RELEASE is still a stub here =E2=80=94 the atomic dax_region_rm_resourc= es API - * and its wire-up land in the next commit. An incoming RELEASE returns - * success and the cxl side proceeds to rm_tag_group(), which device- - * unregisters each dc_extent; the devm action armed by - * dax_region_add_resource() then tears down each dax_resource. 
- */ +static int cxl_dax_group_rm(struct dax_region *dax_region, + struct cxl_dc_tag_group *group) +{ + struct dc_extent *dc_extent; + struct device **devs; + unsigned long index; + unsigned int n =3D 0; + int rc; + + if (!group->nr_extents) + return 0; + + devs =3D kmalloc_array(group->nr_extents, sizeof(*devs), GFP_KERNEL); + if (!devs) + return -ENOMEM; + + xa_for_each(&group->dc_extents, index, dc_extent) { + if (n =3D=3D group->nr_extents) + break; + devs[n++] =3D &dc_extent->dev; + } + + rc =3D dax_region_rm_resources(dax_region, devs, n); + kfree(devs); + return rc; +} + static int cxl_dax_region_notify(struct device *dev, struct cxl_notify_data *notify_data) { @@ -68,10 +101,7 @@ static int cxl_dax_region_notify(struct device *dev, case DCD_ADD_CAPACITY: return cxl_dax_group_add(dax_region, group); case DCD_RELEASE_CAPACITY: - dev_dbg(&cxlr_dax->dev, - "DCD RELEASE notify (tag %pUb): no-op (stub)\n", - &group->uuid); - return 0; + return cxl_dax_group_rm(dax_region, group); case DCD_FORCED_CAPACITY_RELEASE: default: dev_err(&cxlr_dax->dev, "Unknown DC event %d\n", diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index f2ae5918f94d..414813a6137f 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -146,13 +146,17 @@ struct dax_resource { }; =20 /* - * Similar to run_dax() dax_region_add_resource() is exported but is not - * intended to be a generic operation outside the dax subsystem. It is on= ly + * Similar to run_dax() dax_region_{add,rm}_resource() are exported but ar= e not + * intended to be generic operations outside the dax subsystem. They are = only * generic between the dax layer and the dax drivers. 
*/ int dax_region_add_resource(struct dax_region *dax_region, struct device *= dev, resource_size_t start, resource_size_t length, const uuid_t *tag, u16 seq_num); +int dax_region_rm_resource(struct dax_region *dax_region, + struct device *dev); +int dax_region_rm_resources(struct dax_region *dax_region, + struct device * const *devs, unsigned int n); =20 static inline struct dev_dax *to_dev_dax(struct device *dev) { --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f43.google.com (mail-dl1-f43.google.com [74.125.82.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 97A6238C2BF for ; Sat, 23 May 2026 09:44:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529463; cv=none; b=dV1PxwrPDMNweqAnk4gL0lVgpObSqDgc5uDieyqiEbZ9pFPVFPRM6D1jitq0DGIDXHxj9X4L7Yn4i22U6Wf+qFVozi1UJfTZT4ANvOgG0epvWTX9aEvQFGT7hDCYT/gBNnrMI0e6pbW0mo/Jlfu7kLHiwHe0Qo7P3mC8Q7lqH/s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529463; c=relaxed/simple; bh=8IuA4Maawk4JmdW1e3XdQBnbX3d39FsPRnY5vEwRHwA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=goTzDwq208Umxo9aGGDRBKFnPkeogwnK1kCiyJ1JTbiyTT2Cw+IEf9L1wZ+EJtyrYH09RayDqlVTt5mKEyf7/345FNDTKU2X0Mj5r74Nx5CBbkyRoA97UvmfxjMHuAvF+XNOot0ndyL/dWBV3Dq2NRIt61vMSct6xcW7HwNpeq8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=bvgpsEjh; arc=none smtp.client-ip=74.125.82.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bvgpsEjh" Received: by mail-dl1-f43.google.com with SMTP id a92af1059eb24-135e7f4a295so2841138c88.0 for ; Sat, 23 May 2026 02:44:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529461; x=1780134261; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fdYPrQ2V0SsFL9O2oIVedkPq9+Ea1ePouOP0UuVSLgA=; b=bvgpsEjhVR4o15s8rTpxVsKpo5hIHxpQcinL66XyGfZoL5n4yyxgoHmgwuNH6dWwPA g5BMmPge2j0R+Q01tdBNmqzYPY9gXtqSZPBcZvmo+sstSqtpsW+dfuZysvDl4aeNyDyd m0dX7IKlkIsguPaLtTCi8gD88wV3unc8hUYpTUuNLb4EW/Ef4LBSA/IzZBccWVghh0VK 6yS66CpEfV3eK/657IqrQqVmACxeruFZ667jhbWLOo7CWEb1Ovw6CUJ1gflVwRbNNyYm cjZQ1B2bsFAa5aGLjpwvVVisWRoQ0H3WYWW8nPHqTgGr0ONGmCq5u8Pphyni+Rw51348 ch/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529461; x=1780134261; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=fdYPrQ2V0SsFL9O2oIVedkPq9+Ea1ePouOP0UuVSLgA=; b=DtIK01methYHGhtr1X7M+/cKDyrzVcXJl9oR8Jh6eBNFym9/VK5p81fqWB+HboCAbU gujBOvku9EyRFbM0xh6CtOpFd9bP9S/vRt/O3Tq9Fq0b6jHFRzDFwT1+vL8BRC1moB5h DKRMhdTPG5nRv6Xd1Sp4z0+TGe7GaBsHGkjR4HIAQEloR3tCLFhAU78LVV7pBar1RcK5 m1vc+O2YiBI1AGuRiNUR0XWlzsYKl62dNfm12M5fnZ2qPjDTMFpNAqUaCK66/8xONenC fh/ddmfTmuGdQ30YlTcB+40vvBA9HM0S8/fco9xdrlWQWszvjD0DCbNSvbz1z/0NIu+8 swew== X-Forwarded-Encrypted: i=1; AFNElJ/IFbJ/Wc1CDz7GTC84X22/f7sfmNGjWhx/5doj9vM+ercPj5A9oTnKYAqnPUb9k9q3WHkJgLgj0jYvTL8=@vger.kernel.org X-Gm-Message-State: AOJu0YzVYPQEdn6HAThRMfFGyT5T2Z/01R0FI9vi057ybSpb32Qj9kiD MrcnZgKgmn56TIim5k7bwoA0dfER9IT/IAiO5sIPOxQsebVeoM/fWWOj X-Gm-Gg: Acq92OFyPxnP1/CqhBK0Si5tMYh2bBd63wA2UzMpJ3mO/0y2wUvsJEfDjfEqBYIjoCc 
dosX727AlXQgWpkLrafHzrP3f2crmIQdOd2kLamId+iliy++Ztf4FsxeYedVAlTEDqmzUJpNI1E yusdA5N4XNKQ9komIsvO/qIKW2I8ZHdkQCqpRxykcRoXIob8ldlVOP+0G0R8G1U5I/7IRmzL27N faDzz60Imqj1QGDIgBF5O31G6JNl+fFvXQlMRolXNLrhTlr0r3S+xs3P0KMJddlJbYfF1TGDSkw U2XM6IpeTRNJwZqo6fYeyREfkm5KkFkug6eqjFBq2qO0T+A4v4lWB3IymkVIxSN9H94EtGIVkwv e2fhuffzBmz98JpnVNr7Rrk3jDXBs9FKQkXwuBNAb94jKlrnKoQLgaSfk2zo3e6oNwLXsYQW8k9 SkD9X0KvA/JbYsAcEzY9UT1fNQaM+NQamfkEa8E8i8qS62EJ2teXajJtUUQ2SlYzO9ubfepDMRX vrWVH8= X-Received: by 2002:a05:701b:4281:20b0:134:fc38:5d2f with SMTP id a92af1059eb24-136616f4055mr1707179c88.21.1779529460723; Sat, 23 May 2026 02:44:20 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. [73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:20 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Ira Weiny , Jonathan Cameron Subject: [PATCH v10 23/31] dax/bus: Factor out dev dax resize logic Date: Sat, 23 May 2026 02:43:17 -0700 Message-ID: <29393afa419cdffdd5d299cdc323262f5c20c036.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable From: Ira Weiny Dynamic Capacity (DC) DAX regions back their dax devices with per-extent resource children of the region, rather than carving from a single contiguous dax_region->res. 
Allocating space for a DC dax device (on initial uuid claim of its backing extents and on shrink-to-0 during destroy) needs the same allocator the static case uses, but pointed at a different parent resource. Factor the expansion loop of dev_dax_resize() into dev_dax_resize_static(parent, ...), which allocates from whichever parent resource it is given, and have dev_dax_resize() call it with &dax_region->res for static (non-DC) regions. alloc_dev_dax_range() gains the same parent parameter so it can operate under either kind of parent. No functional change. Reviewed-by: Jonathan Cameron Reviewed-by: Dave Jiang Signed-off-by: Ira Weiny --- Changes: [anisa: reword to drop the options-considered discussion and "sparse" terminology; preserved in a later commit that realizes per-extent resource children] --- drivers/dax/bus.c | 131 ++++++++++++++++++++++++++++------------------ 1 file changed, 81 insertions(+), 50 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 6368bdfdf93a..5c1b93890d30 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -1012,11 +1012,10 @@ static int devm_register_dax_mapping(struct dev_dax= *dev_dax, int range_id) return 0; } =20 -static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start, - resource_size_t size, struct dax_resource *dax_resource) +static int alloc_dev_dax_range(struct resource *parent, struct dev_dax *de= v_dax, + u64 start, resource_size_t size, + struct dax_resource *dax_resource) { - struct dax_region *dax_region =3D dev_dax->region; - struct resource *res =3D &dax_region->res; struct device *dev =3D &dev_dax->dev; struct dev_dax_range *ranges; unsigned long pgoff =3D 0; @@ -1034,14 +1033,14 @@ static int alloc_dev_dax_range(struct dev_dax *dev_= dax, u64 start, return 0; } =20 - alloc =3D __request_region(res, start, size, dev_name(dev), 0); + alloc =3D __request_region(parent, start, size, dev_name(dev), 0); if (!alloc) return -ENOMEM; =20 ranges =3D krealloc(dev_dax->ranges, sizeof(*ranges) * (dev_dax->nr_range + 1), GFP_KERNEL); 
if (!ranges) { - __release_region(res, alloc->start, resource_size(alloc)); + __release_region(parent, alloc->start, resource_size(alloc)); return -ENOMEM; } =20 @@ -1195,50 +1194,45 @@ static bool adjust_ok(struct dev_dax *dev_dax, stru= ct resource *res) return true; } =20 -static ssize_t dev_dax_resize(struct dax_region *dax_region, - struct dev_dax *dev_dax, resource_size_t size) +/** + * dev_dax_resize_static - Expand the device into the unused portion of the + * region. This may involve adjusting the end of an existing resource, or + * allocating a new resource. + * + * @parent: parent resource to allocate this range in + * @dev_dax: DAX device to be expanded + * @to_alloc: amount of space to alloc; must be <=3D space available in @p= arent + * + * Return the amount of space allocated or -ERRNO on failure + */ +static ssize_t dev_dax_resize_static(struct resource *parent, + struct dev_dax *dev_dax, + resource_size_t to_alloc) { - resource_size_t avail =3D dax_region_avail_size(dax_region), to_alloc; - resource_size_t dev_size =3D dev_dax_size(dev_dax); - struct resource *region_res =3D &dax_region->res; - struct device *dev =3D &dev_dax->dev; struct resource *res, *first; - resource_size_t alloc =3D 0; int rc; =20 - if (dev->driver) - return -EBUSY; - if (size =3D=3D dev_size) - return 0; - if (size > dev_size && size - dev_size > avail) - return -ENOSPC; - if (size < dev_size) - return dev_dax_shrink(dev_dax, size); - - to_alloc =3D size - dev_size; - if (dev_WARN_ONCE(dev, !alloc_is_aligned(dev_dax, to_alloc), - "resize of %pa misaligned\n", &to_alloc)) - return -ENXIO; - - /* - * Expand the device into the unused portion of the region. This - * may involve adjusting the end of an existing resource, or - * allocating a new resource. 
- */ -retry: - first =3D region_res->child; - if (!first) - return alloc_dev_dax_range(dev_dax, dax_region->res.start, to_alloc, NUL= L); + first =3D parent->child; + if (!first) { + rc =3D alloc_dev_dax_range(parent, dev_dax, + parent->start, to_alloc, NULL); + if (rc) + return rc; + return to_alloc; + } =20 - rc =3D -ENOSPC; for (res =3D first; res; res =3D res->sibling) { struct resource *next =3D res->sibling; + resource_size_t alloc; =20 /* space at the beginning of the region */ - if (res =3D=3D first && res->start > dax_region->res.start) { - alloc =3D min(res->start - dax_region->res.start, to_alloc); - rc =3D alloc_dev_dax_range(dev_dax, dax_region->res.start, alloc, NULL); - break; + if (res =3D=3D first && res->start > parent->start) { + alloc =3D min(res->start - parent->start, to_alloc); + rc =3D alloc_dev_dax_range(parent, dev_dax, + parent->start, alloc, NULL); + if (rc) + return rc; + return alloc; } =20 alloc =3D 0; @@ -1247,21 +1241,56 @@ static ssize_t dev_dax_resize(struct dax_region *da= x_region, alloc =3D min(next->start - (res->end + 1), to_alloc); =20 /* space at the end of the region */ - if (!alloc && !next && res->end < region_res->end) - alloc =3D min(region_res->end - res->end, to_alloc); + if (!alloc && !next && res->end < parent->end) + alloc =3D min(parent->end - res->end, to_alloc); =20 if (!alloc) continue; =20 if (adjust_ok(dev_dax, res)) { rc =3D adjust_dev_dax_range(dev_dax, res, resource_size(res) + alloc); - break; + if (rc) + return rc; + return alloc; } - rc =3D alloc_dev_dax_range(dev_dax, res->end + 1, alloc, NULL); - break; + rc =3D alloc_dev_dax_range(parent, dev_dax, res->end + 1, alloc, NULL); + if (rc) + return rc; + return alloc; } - if (rc) - return rc; + + /* available was already calculated and should never be an issue */ + dev_WARN_ONCE(&dev_dax->dev, 1, "space not found?"); + return 0; +} + +static ssize_t dev_dax_resize(struct dax_region *dax_region, + struct dev_dax *dev_dax, resource_size_t size) +{ + 
resource_size_t avail =3D dax_region_avail_size(dax_region); + resource_size_t dev_size =3D dev_dax_size(dev_dax); + struct device *dev =3D &dev_dax->dev; + resource_size_t to_alloc; + ssize_t alloc; + + if (dev->driver) + return -EBUSY; + if (size =3D=3D dev_size) + return 0; + if (size > dev_size && size - dev_size > avail) + return -ENOSPC; + if (size < dev_size) + return dev_dax_shrink(dev_dax, size); + + to_alloc =3D size - dev_size; + if (dev_WARN_ONCE(dev, !alloc_is_aligned(dev_dax, to_alloc), + "resize of %pa misaligned\n", &to_alloc)) + return -ENXIO; + +retry: + alloc =3D dev_dax_resize_static(&dax_region->res, dev_dax, to_alloc); + if (alloc <=3D 0) + return alloc; to_alloc -=3D alloc; if (to_alloc) goto retry; @@ -1367,7 +1396,8 @@ static ssize_t mapping_store(struct device *dev, stru= ct device_attribute *attr, =20 to_alloc =3D range_len(&r); if (alloc_is_aligned(dev_dax, to_alloc)) - rc =3D alloc_dev_dax_range(dev_dax, r.start, to_alloc, NULL); + rc =3D alloc_dev_dax_range(&dax_region->res, dev_dax, r.start, + to_alloc, NULL); up_write(&dax_dev_rwsem); up_write(&dax_region_rwsem); =20 @@ -1659,7 +1689,8 @@ static struct dev_dax *__devm_create_dev_dax(struct d= ev_dax_data *data) device_initialize(dev); dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id); =20 - rc =3D alloc_dev_dax_range(dev_dax, dax_region->res.start, data->size, NU= LL); + rc =3D alloc_dev_dax_range(&dax_region->res, dev_dax, dax_region->res.sta= rt, + data->size, NULL); if (rc) goto err_range; =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f43.google.com (mail-dl1-f43.google.com [74.125.82.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4F89138E8B4 for ; Sat, 23 May 2026 09:44:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.43 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1779529464; cv=none; b=gqcZldWkZAfOGhTup49/mtM7awpL6iW1UrcCHEzuBxYIgP3ZEfz0XjHCZCBDhEl4sA1rXqegcKQadsAAX061DOQ/V8wgqYN1BnsY5Tz1RvigC2lU8hcDXwuzouYT8PexZPta47s1o1BZCfIJU42Uq8piZ2UzFL1oyRCqBovAFYM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529464; c=relaxed/simple; bh=9HSZkj8sz7iE2pO/UoR8wmVGm/iBP6zckVKXnScvJhU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=b1vsu/twkMjqXy3a3t3/l2B3+qEsT5yg6TAjx9nyRXug+I2DrNtEGIDDff/+wtk5Rnp5lsocEmTcYHxySGcnOUvya7YyP+xP/Y0W9mqPXHZpftUjTzpMqiEFeX8TkGJU/TtCvqbQle4y8qq3v1kcSw1UFLNbR91kIl92G61a+1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=AjiJn0TF; arc=none smtp.client-ip=74.125.82.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="AjiJn0TF" Received: by mail-dl1-f43.google.com with SMTP id a92af1059eb24-1363e78746eso3080232c88.1 for ; Sat, 23 May 2026 02:44:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529462; x=1780134262; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cftZ9CN3nyhvsXby1K+8vux5FcbG22mvmTiGFrN7YKo=; b=AjiJn0TFecZk2JmMQTbY5PaLpTRnOyj/k9ePODy/MnIZ/QQXXaJVWShDbm/XLEwE9d TJA/BLTZ6msx342jDEPnSPtjZv0noPExLhPEboYbNv385dBIVHD68Fe/vMRotkqBGfhx DcEDqf55ApbWltuUnaQpqm4G+DcttjQXs7wlnpsdgIh6+71DWhgNLXbmz+D98G/S1TwH ukQFqQN8qH+8mfu62SlFFzs0Ovf65QNMF7FO/hBxMiAUIxbfMNiOVdaJ90Y0KPPV9hcO 
PVmW3DaDTqJCR/a+6dph2QL58yj+rpa5o/ToNJ1O0eCT9ctVN/vCXIsBM8+ppKGZznIw MBUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529462; x=1780134262; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=cftZ9CN3nyhvsXby1K+8vux5FcbG22mvmTiGFrN7YKo=; b=lpAWF3a8R+OjIc81DZRl4laqqb/3loMxG5wXXRHbabwGT0jK9dAPvui3HQ8SbajfTj 5kt81tTjwJU8qI1cZwskoYBq0kDqOcun0BIHSOs5fbUxs9hGjxYSRXigf6LxI4tuXPDT idNHYdnJGpDuYQx3I+RN5/gI2DcAMP5/o3u/zP5sh/rvE4bdT2AibRgIfp8/AOF2MVkd 1ksoehhdGCAVDzjnZfB/P/DIC5vlXxZ+bwTAA5iHiOf5do/ORyL+4CDQeKqk3A6/iieE Xz2yaO91rjkK5hrjw5r5wUZicZxIuYt8NYmUbk88MBUdzbs8aqLaG1KzqbSchdh2/DPm WBsA== X-Forwarded-Encrypted: i=1; AFNElJ+tQlqoxecBNfCcT5LVNqVmvw8o/elf89TS/r8JEA3z81MNTCgyMuOiv4lV8eMx5AyVqj3mN678GDd8s8w=@vger.kernel.org X-Gm-Message-State: AOJu0YwGC1a4fud56iq8u0OvV9GF3k6O4Ly/iE+c+cyyWKtFTEF4aBgy Tg/0sO2caNoNJzWTicmzJzi4eMDhdtx7uPJNjhCSGN11ZbRiWHZMX7Vq X-Gm-Gg: Acq92OFLhI+BkkjEHX35WOqq9bjieBdu3w8AbLOsOKaxiZs0IKnEU4Z3Abf7Aq4mNNP 1my3F0JWazCL6OHgPPgQA/vtbS1hYQ7Qdis8J6b9tEROZ7u4+Yl/5a+YFb7S6QWz4tQ7hom4UlY cH7smuyqRP9XD90cR2fz8EZIDS75seebNTL6IzaWE14ykwQ0TSv2QMyEdJvlcMjtFFob7uC8gDD /fkZJgvimOZ+XBpcV8RGnDers7kZ6isFgmX+s3cb1LCupUqrTcHXLoKP2eADs+mazcJe2YdHtRq y8/cyTOqL0xRnZqethXGyRh4VJU8/hxit76dsSOekQ9tmKvP+qBEcQAe+BeZKlsaKiofwKCO+IU KTjGRqz4OAVYgH2zKvusqpZlQ3J2VYpKVDik4gkVQjp2e4CIBN2Ox+azddCIaBn/79oEyCUlwxV XmkPpvB4J82/35Aj7mNVWsMNZR6mz8RdfkPMFZVwwrQDOOQ0prnPP7P5PW+6TqIgvAi/RbbQniR +o2aLI= X-Received: by 2002:a05:7022:fe09:b0:134:ff2e:a71c with SMTP id a92af1059eb24-1365f600b14mr2361648c88.9.1779529462407; Sat, 23 May 2026 02:44:22 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:21 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su Subject: [PATCH v10 24/31] dax/bus: Add uuid sysfs attribute to dax devices Date: Sat, 23 May 2026 02:43:18 -0700 Message-ID: <00e5da991afc1c96ca1074152ec10d0d8484b673.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Introduce a read-write 'uuid' sysfs entry at /sys/bus/dax/devices/daxX.Y/ with stubbed handlers: show returns "0" and store returns -EOPNOTSUPP. A follow-on patch wires both directions to dax_resource tracking. Document the attribute in the dax sysfs ABI. Signed-off-by: Anisa Su --- Documentation/ABI/testing/sysfs-bus-dax | 18 ++++++++++++++++++ drivers/dax/bus.c | 14 ++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/te= sting/sysfs-bus-dax index b34266bfae49..23400824073b 100644 --- a/Documentation/ABI/testing/sysfs-bus-dax +++ b/Documentation/ABI/testing/sysfs-bus-dax @@ -59,6 +59,24 @@ Description: backing device for this dax device, emit the CPU node affinity for this device. =20 +What: /sys/bus/dax/devices/daxX.Y/uuid +Date: May, 2026 +KernelVersion: v6.16 +Contact: nvdimm@lists.linux.dev +Description: + (RW) On read, reports the uuid identifying the capacity + backing this dax device. 
A value of "0" indicates that the + device has no associated uuid =E2=80=94 either it is not backed by + DCD capacity, or the backing extents are untagged. + + Writes are accepted only on dax devices in sparse (DCD) + regions; writes to non-sparse devices return -EOPNOTSUPP. + Writing a non-null uuid claims every dax_resource in the + parent region whose tag matches the written uuid, consuming + any available capacity in each matching resource. Writing + "0" is shorthand for the null uuid and claims a single + untagged dax_resource. + What: /sys/bus/dax/devices/daxX.Y/target_node Date: February, 2019 KernelVersion: v5.1 diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 5c1b93890d30..1d6f82920be6 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -1526,6 +1526,19 @@ static ssize_t numa_node_show(struct device *dev, } static DEVICE_ATTR_RO(numa_node); =20 +static ssize_t uuid_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "%d\n", 0); +} + +static ssize_t uuid_store(struct device *dev, struct device_attribute *att= r, + const char *buf, size_t len) +{ + return -EOPNOTSUPP; +} +static DEVICE_ATTR_RW(uuid); + static ssize_t memmap_on_memory_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -1597,6 +1610,7 @@ static struct attribute *dev_dax_attributes[] =3D { &dev_attr_resource.attr, &dev_attr_numa_node.attr, &dev_attr_memmap_on_memory.attr, + &dev_attr_uuid.attr, NULL, }; =20 --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f50.google.com (mail-dl1-f50.google.com [74.125.82.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC3F138F939 for ; Sat, 23 May 2026 09:44:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529468; 
cv=none; b=sZfkVNYlTboZZB7zVH90jb7M3fhepA6027KAv9aa0Hle5mcJPi0uTxjx2r5NnLruh3/QV1Cl2Lo4RYD24wJK2sa1q9LNG8E0/wnNOmheVMRwJfjX9C+ZY4IhZJBFhsgHrB57gzrE7X7A8grOtm6lhvr6siCtlYyk184bLLk/GWM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529468; c=relaxed/simple; bh=3HXDv3onY23MfZ6DmFqJ7fSkS0iWKYaMT2eOTFDeoDQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qtlmFrNJ7nknIdN/ThjodHs6yIkZcxneOdKjBop4Vq1O/Uz7lOpLuezkaN/v7xAbDC4AWpVnCQbjFz9UFNEpiRE65/9iWbICTx1Y1ZLXNzFsqXjx3HaK1bBeXa8dW9jnURZsxsp0WjCs01fr8bP+wM3o6VT5RG6wzDEyLFNPrqY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=sICBeBC2; arc=none smtp.client-ip=74.125.82.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="sICBeBC2" Received: by mail-dl1-f50.google.com with SMTP id a92af1059eb24-1363e78746eso3080243c88.1 for ; Sat, 23 May 2026 02:44:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529465; x=1780134265; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=o82VrbrSunpqYPkFfmmxbOQmk9JmN+rFJu6lm18iL6E=; b=sICBeBC21q7VsRME2/b8IBqFaDn7Yfo+tOSniIxM5PPNlwQ1cRFRSmxQmhCVpeYS3S TTCt4X1y2fU7DDLuPD2Hxx4S3ytf58DNaXNEh7qY08EQCYrz34DF2U/iE0cz+teLtZKS +9SyRwJFI0ZLo1ciiaDPYSd2kzczaa8kNppZK8QWBXqGCq+eoeTQ6FjIyc6giSVcJB3F GNViYvEWJH2Rt5Cg2QQE0Q6Nf+VsAdLQ11LJNSMNTZDLBMo5fHZxlGvfuWlJ6iht/O8J HrY4OMGph/qaOjBHktsMfxryG4Pw0XuniHKmllBjEGp7Q6GztrkEwJz2OogAz4HDChvb 
CU+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529465; x=1780134265; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=o82VrbrSunpqYPkFfmmxbOQmk9JmN+rFJu6lm18iL6E=; b=NM+soheyl2HtrDVWsAXmlnF5cwT6DVurOvOFt3dUFOxsmMjtKX6a7zF/1s/B41NJFU RTm6rJoW2mskuRtXzJPnvul3MTYhjxUp/3+vMvnDHmt7kOj7eivS4Ioqw8SvC+cNcI4t +F/YCq46tz8FKxyWSPTohCmYIBXVdt3ElrGCvjXmqZAcr9/eeEyu39PPfPfXUoeEvCEz juv9Mq720OgjdytGZJ3frP3DhP/NbM3SMuVXKF/eNYbWDiZ5x3ZZgMePeEhAL/q1a2KN FjDslrvNM6ApBu4cmPZFwolA4vvAU+t3jehAQNLwev88qAiYT8FwhpX2diJwRQliXimG 8PNw== X-Forwarded-Encrypted: i=1; AFNElJ/59JQHjWw17BjEU1Ay5U/9f1BHU29mwxDYSY5PUWuVHzjc0QUhuA1q835+DywbHrftmLxrxPojGbCMGdE=@vger.kernel.org X-Gm-Message-State: AOJu0Yzt/8DmWoTaUgIw9QpCaRMe4LrA1m98DACgd3tMw+no5UUhAlF3 dTvadHTMNzUFUIRaQgHXCA5yX6bSO8eWa92Xb0ine+ZdqvFqHR1xpqPi X-Gm-Gg: Acq92OFq2IN+eiZXkfBJkpjBw/iTa73BA8lMiYJd3pR4KZMRGdLU2JfzGYS2FB4ZnSt 1S63eCp3LCvyjgaksCHKEWtfTpgk9RCUOi0rTRU6GKCn/RPqusiA2PHuCzzVAMWwOnx+tBUxldb +yLcOUt5WLJyPZPwNYcJZYUvcvColAyrZjaNwixJBv9bX455Esfv0CgJ71hloQRscP0HGexJkPL HPlfGE6H7/y+KhVOjYGP/JgoJcWjaliEu/i6irCJ1LPdEGSPZHYxB4okRee5XDCGGOTSNoCr0uo IjpbXCKJ2s+JLBDoXnGuseitjR1CW86y9rCjI4IvZCQN77LpwFWN2x+HvnxMsiGs6jGq1tlz8hO 49mRIUID1w5G8c1Oo4Hh2KrX5ODVXx+IqNG9UMqumFe/CclAFrhXBYIcZHRUDHjAEs8DOK1tjhd 0GHFlEmWD4vn0FyP68CnzqP+VXOT7a4mla5PcGmc7HYgOQmu7sAONFc7R3GOqW58LhkLZ0/S+kz bKVUtE= X-Received: by 2002:a05:7022:b042:20b0:136:73ec:922d with SMTP id a92af1059eb24-13673ec94e7mr1323618c88.36.1779529464768; Sat, 23 May 2026 02:44:24 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:23 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su , Ira Weiny Subject: [PATCH v10 25/31] dax/bus: Reject resize on DC dax devices and enforce 0-size creation Date: Sat, 23 May 2026 02:43:19 -0700 Message-ID: <9c73377182f19e86e2cc939ddf0184d5d85581f9.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable A DC dax device's size is determined by the extents that back it, not by the user. DCD extents are all-or-nothing, so partial shrink is just as illegal as growing. Enforce that on the size and creation paths: * size_store: any non-zero resize on a DC region returns -EOPNOTSUPP. The sole exception is size=3D0, which daxctl destroy-device writes to return every claimed extent to the region's available pool before the device's name is written to the region's 'delete' attribute. * __devm_create_dev_dax: a DC dax device must be created at size 0. Non-zero data->size on a DC region returns -EINVAL with a clear message. The resize machinery (dev_dax_shrink, adjust_ok, dev_dax_resize_static, dev_dax_resize) learns to walk the right parent =E2=80=94 dax_region->res f= or static regions, the dax_resource->res for DC regions claimed via uuid_store =E2=80=94 so shrink-to-0 correctly releases each extent's child resource rather than the region's. Based on an original patch by Navneet Singh. 
Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: split out from the original "Surface dc_extents" commit; DC-aware resize policy only.] --- drivers/dax/bus.c | 46 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 35 insertions(+), 11 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 1d6f82920be6..c030eb103ad0 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -1136,7 +1136,8 @@ static int dev_dax_shrink(struct dev_dax *dev_dax, re= source_size_t size) int i; =20 for (i =3D dev_dax->nr_range - 1; i >=3D 0; i--) { - struct range *range =3D &dev_dax->ranges[i].range; + struct dev_dax_range *dev_range =3D &dev_dax->ranges[i]; + struct range *range =3D &dev_range->range; struct dax_mapping *mapping =3D dev_dax->ranges[i].mapping; struct resource *adjust =3D NULL, *res; resource_size_t shrink; @@ -1152,6 +1153,10 @@ static int dev_dax_shrink(struct dev_dax *dev_dax, r= esource_size_t size) continue; } =20 + /* + * Partial shrink: forbidden on DC regions, so dev_range + * here must belong to a static device. + */ for_each_dax_region_resource(dax_region, res) if (strcmp(res->name, dev_name(dev)) =3D=3D 0 && res->start =3D=3D range->start) { @@ -1195,19 +1200,21 @@ static bool adjust_ok(struct dev_dax *dev_dax, stru= ct resource *res) } =20 /** - * dev_dax_resize_static - Expand the device into the unused portion of the - * region. This may involve adjusting the end of an existing resource, or - * allocating a new resource. + * __dev_dax_resize - Expand the device into the unused portion of the reg= ion. + * This may involve adjusting the end of an existing resource, or allocati= ng a + * new resource. 
* * @parent: parent resource to allocate this range in * @dev_dax: DAX device to be expanded * @to_alloc: amount of space to alloc; must be <=3D space available in @p= arent + * @dax_resource: if dc; the parent resource * * Return the amount of space allocated or -ERRNO on failure */ -static ssize_t dev_dax_resize_static(struct resource *parent, - struct dev_dax *dev_dax, - resource_size_t to_alloc) +static ssize_t __dev_dax_resize(struct resource *parent, + struct dev_dax *dev_dax, + resource_size_t to_alloc, + struct dax_resource *dax_resource) { struct resource *res, *first; int rc; @@ -1215,7 +1222,8 @@ static ssize_t dev_dax_resize_static(struct resource = *parent, first =3D parent->child; if (!first) { rc =3D alloc_dev_dax_range(parent, dev_dax, - parent->start, to_alloc, NULL); + parent->start, to_alloc, + dax_resource); if (rc) return rc; return to_alloc; @@ -1229,7 +1237,8 @@ static ssize_t dev_dax_resize_static(struct resource = *parent, if (res =3D=3D first && res->start > parent->start) { alloc =3D min(res->start - parent->start, to_alloc); rc =3D alloc_dev_dax_range(parent, dev_dax, - parent->start, alloc, NULL); + parent->start, alloc, + dax_resource); if (rc) return rc; return alloc; @@ -1253,7 +1262,8 @@ static ssize_t dev_dax_resize_static(struct resource = *parent, return rc; return alloc; } - rc =3D alloc_dev_dax_range(parent, dev_dax, res->end + 1, alloc, NULL); + rc =3D alloc_dev_dax_range(parent, dev_dax, res->end + 1, alloc, + dax_resource); if (rc) return rc; return alloc; @@ -1264,6 +1274,13 @@ static ssize_t dev_dax_resize_static(struct resource= *parent, return 0; } =20 +static ssize_t dev_dax_resize_static(struct dax_region *dax_region, + struct dev_dax *dev_dax, + resource_size_t to_alloc) +{ + return __dev_dax_resize(&dax_region->res, dev_dax, to_alloc, NULL); +} + static ssize_t dev_dax_resize(struct dax_region *dax_region, struct dev_dax *dev_dax, resource_size_t size) { @@ -1277,6 +1294,8 @@ static ssize_t dev_dax_resize(struct 
dax_region *dax_= region, return -EBUSY; if (size =3D=3D dev_size) return 0; + if (size !=3D 0 && is_dynamic(dax_region)) + return -EOPNOTSUPP; if (size > dev_size && size - dev_size > avail) return -ENOSPC; if (size < dev_size) @@ -1288,7 +1307,7 @@ static ssize_t dev_dax_resize(struct dax_region *dax_= region, return -ENXIO; =20 retry: - alloc =3D dev_dax_resize_static(&dax_region->res, dev_dax, to_alloc); + alloc =3D dev_dax_resize_static(dax_region, dev_dax, to_alloc); if (alloc <=3D 0) return alloc; to_alloc -=3D alloc; @@ -1674,6 +1693,11 @@ static struct dev_dax *__devm_create_dev_dax(struct = dev_dax_data *data) struct device *dev; int rc; =20 + if (is_dynamic(dax_region) && data->size) { + dev_err(parent, "DC DAX region devices must be created initially with 0 = size"); + return ERR_PTR(-EINVAL); + } + dev_dax =3D kzalloc_obj(*dev_dax); if (!dev_dax) return ERR_PTR(-ENOMEM); --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f52.google.com (mail-dl1-f52.google.com [74.125.82.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F324390C8A for ; Sat, 23 May 2026 09:44:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529471; cv=none; b=og5qRPpMjtB0DSKoGjvSCu93XTb2GBf0Q+11PEGwIXwuqbSQYxXNkk+b5V1KBJ06gl0nKlfNia136FpFeEYZL0F+euWw8w24yDEd489F07XX4SJ56rpqmI8ZxOl9/BN+ft0tnbsmrhc4uIQMNlOj7qOrWqCrH4Rdi4geQqSVos8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529471; c=relaxed/simple; bh=dPc+Jmt6Ao4NKtb3a2yTElMc9JU4ev8Gi4vvYTbxHSM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=dVHscxSmca+iALLSgiq8KSawcNonHlhcrKvuBg351ej8aIZDvVwk6BwFSRmrDHoI9PYuXMcXYtVlIQbLAMMe3nsiY5P7M4BVFoGVcjY5Ce3bmRI+5yZIYpGRAorlinJ2Ce03KUFX3P9GzeAgko+ZlY1XbWXLQDbALWTGMTCTuLQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=lv53Bdaz; arc=none smtp.client-ip=74.125.82.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="lv53Bdaz" Received: by mail-dl1-f52.google.com with SMTP id a92af1059eb24-132830d8281so2577715c88.1 for ; Sat, 23 May 2026 02:44:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779529467; x=1780134267; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9mYuq+VCymP+W06eRpvcFDkonJ88MmeA72PafHhyXdc=; b=lv53BdazzmWJT0t9Gz7CtkfcFD1tnH7w9CoaWqttMquusCmKmR4BS35dyo9hWWzOmF 0T+hrbD65ydYtP3m33ErJM+fwVfNhyhPoxRgXnb0zfMI5yjSPlRZDNNHhPNxEa2XZeII Iap+Tmd/yLL1MtnADcpEj2DqomGnOpunexR3wIW3mbIGZKDFcFJaraRasMGMiWitQE5P GHz8h0yTpGjiMYMB5P/RhfLEFGTyl9k7SsO6cBP7zgMNb/bQ0oe/TlPO7tFOvVRjJieh rDI9eSSxqSUci9R8rlmJSgBg/IP1ScNaqqIqVv0T+rhDea1g9oO0klqO39q7ZcxltZn0 IjAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779529467; x=1780134267; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=9mYuq+VCymP+W06eRpvcFDkonJ88MmeA72PafHhyXdc=; b=SzQieAaYnhxOJVFk0748ixCMvrhqyPuhujdFUxtp+qIFInVWw31/kC1Lgbh70WD1Ol 
lQFkoB3m5CLuRqsAH/gcYe6pK60f1g2sU3fX43Z7ZYow9Us+w1v87hxMp5IAiNgcgBZn xOEWFunt5ibptBK0C7V25b8sdQypns0Td3YqhtTqIV4QjhSSTp7T00jXJIkjmiTF7H15 7+0uodpAdCB7852j1jaiApQiNpptvhUSXHyf85YFz95lMR94dcfo46rjpElAUSfrWoah XXMc0odSRqZowmjB54kemy76Pp5V5AYjOnPPbg98fMVTb4HAQuQRsQAVQn6YaWqN4ZFl LUxQ== X-Forwarded-Encrypted: i=1; AFNElJ/Eeerhc+EA1xpAMXBcPFZIqmbj/ZZ5mYbPDFbHnlTZ5+ZAnXWBzsAvfUVMq/InWNMJQw5x9l3e3GTLQbM=@vger.kernel.org X-Gm-Message-State: AOJu0YwE4POtBXMbACidD9w/Hxsgh8xdMnLGPehkRx+uhHlMdodelFAV zNh0InK+eQ/x5VW4QPgZsn78Tt2eOrgNFteci/W1Gb/1HycMzCfyuv8f X-Gm-Gg: Acq92OH+8QgjxCTQle2px5AnVQrmFvYFlqMOqABYjrqPgdGVPoCio8Nu3YjJgu4MxoA ASjDzxAnR/V8lyp0gRUvORVjeR+nlfUIV2463rcT8Rxmo1zgp6BiatzuxtDVAyeF91hzhshYIkG WSMfT8MIbmeQaMlTpzZ0Za5K47dK7eBPUNvEf1YrtaAqfCd99Plb5vP6Y1vWowLmY/jvAgZfoQk 6bU4Wum3TmHQL+wV74+5KRm5Aoz0h749gCP4yqxGY5hwN9tmtcIkR2dyu0Tmf0Isvbhz4T+DzIn yEjhwQaEXUCAhushx1Azf7vSApIkfvb8I29pNKv8avCfi5WiGTAD5GV09pDPk/3m0sGYd42rZnl DyXJKdq7Gb+8hvvJ//K14+sn1tjrEWDIQjJPxAh4CjJ4UGbHJdKCzdKcctV+Ew/rsl+JgRPTHwP zbKAOcfdUMHHDBy/FLeH/34o0kB9M/92lqKyTwYgm8L7VALReOvSn5E6+TZRdBi7Ar6jeXiXmtm k4NPK0= X-Received: by 2002:a05:7022:68a3:b0:130:9b78:b18d with SMTP id a92af1059eb24-1365fc62681mr2472043c88.34.1779529467174; Sat, 23 May 2026 02:44:27 -0700 (PDT) Received: from AnisaLaptop.localdomain (c-73-170-217-179.hsd1.ca.comcast.net. 
[73.170.217.179]) by smtp.gmail.com with ESMTPSA id a92af1059eb24-1366a40305csm2376358c88.7.2026.05.23.02.44.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 23 May 2026 02:44:26 -0700 (PDT) From: Anisa Su X-Google-Original-From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams , Jonathan Cameron , Davidlohr Bueso , Dave Jiang , Vishal Verma , Ira Weiny , Alison Schofield , John Groves , Gregory Price , Anisa Su , Ira Weiny Subject: [PATCH v10 26/31] dax/bus: Tag-aware uuid claim and show on DC dax devices Date: Sat, 23 May 2026 02:43:20 -0700 Message-ID: <89784b600e4284772c4136b462e948e016129cdf.1779528761.git.anisa.su@samsung.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable DC DAX regions are populated with dax_resource children that each carry a backing tag uuid and a per-allocation sequence number (seq_num). Add the userspace claim semantics that resolve those tagged groups into DAX devices. A DC region's seed dax device is created at 0-size on probe; userspace populates it by writing to its 'uuid' attribute: * A non-null UUID claims every dax_resource on this region whose tag matches, in seq_num order via uuid_claim_tagged(). The match set must form a dense 1..n sequence (no gap, no duplicate); the CXL side maintains this invariant for both sharable allocations (where the device stamps shared_extn_seq) and non-sharable allocations (where cxl_add_pending assigns arrival-order seq). The resulting DAX device's size equals the sum of every member extent's size. * "0" claims a single untagged dax_resource via uuid_claim_untagged(). 
Untagged extents are independent allocations; collapsing several would aggregate unrelated capacity, so each uuid=3D"0" write consumes exactly one untagged resource. * A write that matches no dax_resource returns -ENOENT; the device stays at size 0. uuid_show() reads back the backing tag uuid, or "0" when the device has no tagged backing resource (for example after an untagged claim). The attribute is read-only (0444) on non-DC dax devices; writes to it on non-DC regions return -EOPNOTSUPP. dev_dax_visible() keeps the uuid attribute writable only on DC dax devices. Based on an original patch by Navneet Singh. Signed-off-by: Ira Weiny Signed-off-by: Anisa Su --- Changes: [anisa: split out from the original "Surface dc_extents" commit; userspace tag-claim semantics only.] --- drivers/dax/bus.c | 260 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 256 insertions(+), 4 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index c030eb103ad0..1dccb3e5cd0f 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include "dax-private.h" @@ -1316,6 +1317,89 @@ static ssize_t dev_dax_resize(struct dax_region *dax= _region, return 0; } =20 +/* DC extents are all-or-nothing: an extent is either free or fully claime= d.
*/ +static bool dax_resource_in_use(const struct dax_resource *dax_resource) +{ + return dax_resource->use_cnt > 0; +} + +struct dax_uuid_match { + const struct dax_region *dax_region; + const uuid_t *uuid; +}; + +static int find_uuid_extent(struct device *dev, const void *data) +{ + const struct dax_uuid_match *match =3D data; + struct dax_resource *dax_resource; + + if (!match->dax_region->dc_ops->is_extent(dev)) + return 0; + + dax_resource =3D dev_get_drvdata(dev); + if (!dax_resource || dax_resource_in_use(dax_resource)) + return 0; + return uuid_equal(&dax_resource->uuid, match->uuid); +} + +struct dax_tag_collect { + const struct dax_region *dax_region; + const uuid_t *uuid; + struct dax_resource **arr; + unsigned int count; + unsigned int cap; +}; + +static int collect_uuid_extent(struct device *dev, void *data) +{ + struct dax_tag_collect *c =3D data; + struct dax_resource *dax_resource; + + if (!c->dax_region->dc_ops->is_extent(dev)) + return 0; + + dax_resource =3D dev_get_drvdata(dev); + if (!dax_resource || dax_resource_in_use(dax_resource)) + return 0; + if (!uuid_equal(&dax_resource->uuid, c->uuid)) + return 0; + + if (c->count =3D=3D c->cap) + return -ENOSPC; + c->arr[c->count++] =3D dax_resource; + return 0; +} + +static int count_uuid_extent(struct device *dev, void *data) +{ + struct dax_tag_collect *c =3D data; + struct dax_resource *dax_resource; + + if (!c->dax_region->dc_ops->is_extent(dev)) + return 0; + + dax_resource =3D dev_get_drvdata(dev); + if (!dax_resource || dax_resource_in_use(dax_resource)) + return 0; + if (!uuid_equal(&dax_resource->uuid, c->uuid)) + return 0; + + c->count++; + return 0; +} + +static int dax_resource_seq_cmp(const void *a, const void *b) +{ + const struct dax_resource * const *pa =3D a; + const struct dax_resource * const *pb =3D b; + + if ((*pa)->seq_num < (*pb)->seq_num) + return -1; + if ((*pa)->seq_num > (*pb)->seq_num) + return 1; + return 0; +} + static ssize_t size_store(struct device *dev, struct 
device_attribute *att= r, const char *buf, size_t len) { @@ -1548,13 +1632,177 @@ static DEVICE_ATTR_RO(numa_node); static ssize_t uuid_show(struct device *dev, struct device_attribute *attr, char *buf) { - return sysfs_emit(buf, "%d\n", 0); + struct dev_dax *dev_dax =3D to_dev_dax(dev); + int rc; + + rc =3D down_read_interruptible(&dax_dev_rwsem); + if (rc) + return rc; + + for (int i =3D 0; i < dev_dax->nr_range; i++) { + struct dax_resource *r =3D dev_dax->ranges[i].dax_resource; + + if (r && !uuid_is_null(&r->uuid)) { + rc =3D sysfs_emit(buf, "%pUb\n", &r->uuid); + goto out; + } + } + rc =3D sysfs_emit(buf, "0\n"); +out: + up_read(&dax_dev_rwsem); + return rc; +} + +static ssize_t uuid_claim_untagged(struct dax_region *dax_region, + struct dev_dax *dev_dax) +{ + struct dax_uuid_match match =3D { + .dax_region =3D dax_region, + .uuid =3D &uuid_null, + }; + struct dax_resource *dax_resource; + resource_size_t to_alloc; + struct device *extent_dev; + ssize_t alloc; + + extent_dev =3D device_find_child(dax_region->dev, &match, + find_uuid_extent); + if (!extent_dev) + return -ENOENT; + + dax_resource =3D dev_get_drvdata(extent_dev); + to_alloc =3D resource_size(dax_resource->res); + alloc =3D __dev_dax_resize(dax_resource->res, dev_dax, to_alloc, + dax_resource); + put_device(extent_dev); + if (alloc < 0) + return alloc; + if (alloc =3D=3D 0) + return -ENOENT; + dax_resource->use_cnt++; + return 0; +} + +static ssize_t uuid_claim_tagged(struct dax_region *dax_region, + struct dev_dax *dev_dax, const uuid_t *uuid) +{ + struct dax_tag_collect c =3D { + .dax_region =3D dax_region, + .uuid =3D uuid, + }; + unsigned int i; + ssize_t rc; + + /* Two-pass: count, then collect into a sized array. 
*/ + device_for_each_child(dax_region->dev, &c, count_uuid_extent); + if (!c.count) + return -ENOENT; + + c.arr =3D kmalloc_array(c.count, sizeof(*c.arr), GFP_KERNEL); + if (!c.arr) + return -ENOMEM; + c.cap =3D c.count; + c.count =3D 0; + + rc =3D device_for_each_child(dax_region->dev, &c, collect_uuid_extent); + if (rc) + goto out; + + sort(c.arr, c.count, sizeof(*c.arr), dax_resource_seq_cmp, NULL); + + /* + * Tagged groups carry a dense 1..n @seq_num regardless of source + * (sharable: device-stamped; non-sharable: host-assigned in + * arrival order =E2=80=94 see &struct dax_resource). A gap or + * out-of-range value here means an extent went missing on the + * cxl side (e.g. a per-extent failure in cxl_add_pending) or a + * cxl-side validation gap; in either case refuse the whole + * group rather than carve a partial allocation. + */ + for (i =3D 0; i < c.count; i++) { + if (c.arr[i]->seq_num !=3D i + 1) { + dev_WARN_ONCE(dax_region->dev, 1, + "tag %pUb seq invariant violated at slot %u (got %u)\n", + uuid, i, c.arr[i]->seq_num); + rc =3D -EINVAL; + goto out; + } + } + + for (i =3D 0; i < c.count; i++) { + resource_size_t to_alloc =3D resource_size(c.arr[i]->res); + ssize_t alloc; + + alloc =3D __dev_dax_resize(c.arr[i]->res, dev_dax, to_alloc, + c.arr[i]); + if (alloc < 0) { + rc =3D alloc; + goto rollback; + } + if (alloc =3D=3D 0) { + rc =3D -ENOSPC; + goto rollback; + } + c.arr[i]->use_cnt++; + } + rc =3D 0; + goto out; + +rollback: + /* + * Partial failure: trim every range we added in this attempt. + * trim_dev_dax_range pops the most-recently-appended range from + * dev_dax->ranges[] and decrements its dax_resource->use_cnt, so + * looping until we have undone @i additions restores both + * dev_dax->ranges[] and the matched dax_resources' use_cnt. 
+ */ + while (i-- > 0) + trim_dev_dax_range(dev_dax); +out: + kfree(c.arr); + return rc; } =20 static ssize_t uuid_store(struct device *dev, struct device_attribute *att= r, const char *buf, size_t len) { - return -EOPNOTSUPP; + struct dev_dax *dev_dax =3D to_dev_dax(dev); + struct dax_region *dax_region =3D dev_dax->region; + uuid_t uuid; + ssize_t rc; + + if (!is_dynamic(dax_region)) + return -EOPNOTSUPP; + + if (sysfs_streq(buf, "0")) + uuid_copy(&uuid, &uuid_null); + else { + rc =3D uuid_parse(buf, &uuid); + if (rc) + return rc; + } + + rc =3D down_write_killable(&dax_region_rwsem); + if (rc) + return rc; + if (!dax_region->dev->driver) { + rc =3D -ENXIO; + goto err_region; + } + rc =3D down_write_killable(&dax_dev_rwsem); + if (rc) + goto err_region; + + if (uuid_is_null(&uuid)) + rc =3D uuid_claim_untagged(dax_region, dev_dax); + else + rc =3D uuid_claim_tagged(dax_region, dev_dax, &uuid); + + up_write(&dax_dev_rwsem); +err_region: + up_write(&dax_region_rwsem); + + return rc < 0 ? 
rc : len; } static DEVICE_ATTR_RW(uuid); =20 @@ -1614,8 +1862,12 @@ static umode_t dev_dax_visible(struct kobject *kobj,= struct attribute *a, int n) return 0; if (a =3D=3D &dev_attr_mapping.attr && is_dynamic(dax_region)) return 0; - if ((a =3D=3D &dev_attr_align.attr || - a =3D=3D &dev_attr_size.attr) && is_static(dax_region)) + if (a =3D=3D &dev_attr_uuid.attr && !is_dynamic(dax_region)) + return 0444; + if (a =3D=3D &dev_attr_align.attr && + (is_static(dax_region) || is_dynamic(dax_region))) + return 0444; + if (a =3D=3D &dev_attr_size.attr && is_static(dax_region)) return 0444; return a->mode; } --=20 2.43.0 From nobody Sun May 24 20:33:06 2026 Received: from mail-dl1-f50.google.com (mail-dl1-f50.google.com [74.125.82.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B48FD3911D3 for ; Sat, 23 May 2026 09:44:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529472; cv=none; b=IOdVsgaZpszjYchpxf4mCySNDOwla0e8/r8s21YZ0Gz+ydL2eIOJDb40XKiCZAu0Kg9/ApcDpzs6WFMzjZkVK60ORED0p/of80om5eMLL9kiZTjR1SGyGQoGpR078oVjq5V6Ue3o9Pq5p9A4kbcr5H25ZizDUXlpTPkVobDz944= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779529472; c=relaxed/simple; bh=JPCMU4iD/OhP7hoGghQn0xJwqOz/NcDzAZp6Xaopfi4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FFeXlf13RkwS/fzH6aqOgR4xK2f+4bn82zMPXYYZfHiwqb80+8jqlzSfSpjTkckr4/QLk33qZdvBLTeIokWwKa8BDYMw3HFoIWC26zajLDvVrSethEiR+rNMTfQnvJ4oJMCHhJ3iwv+696joItE5SEbdMSeiYTUJJwn3rAldEHU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=PPtYlbBo; arc=none smtp.client-ip=74.125.82.50 
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny,
 Jonathan Cameron, Fan Ni
Subject: [PATCH v10 27/31] cxl/region: Read existing extents on region creation
Date: Sat, 23 May 2026 02:43:21 -0700

From: Ira Weiny

Dynamic capacity device extents may be left in an accepted state on a
device due to an unexpected host crash. In this case it is expected
that the creation of a new region on top of a DC partition can read
those extents and surface them for continued use.

Once all endpoint decoders are part of a region and the region is being
realized, a read of the device's extent list can reveal these
previously accepted extents.

CXL r3.1 specifies the mailbox call Get Dynamic Capacity Extent List
for this purpose. The call returns all the extents for all dynamic
capacity partitions. If the fabric manager is adding extents to any DCD
partition, the extent list for the recovered region may change. In this
case the query must retry. Upon retry the query could encounter extents
which were accepted on a previous list query. Adding such extents is
not treated as an error because each lies entirely within a previously
accepted extent. Instead, warn on this case so that bad devices can be
distinguished from this normal condition.

Latch any errors to be bubbled up to ensure notification to the user
even if individual errors are rate limited or otherwise ignored.

The scan for existing extents races with the dax_cxl driver. This is
synchronized through the region device lock. Extents which are found
after the driver has loaded will surface through the normal
notification path, while extents seen prior to the driver are read
during driver load.

Based on an original patch by Navneet Singh.

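Illustration only, not part of the patch: the retry rule described above
can be modelled in user-space C. Every name below (mock_dev,
mock_get_page, read_extents_once, read_extents, READ_RETRY) is a
hypothetical stand-in and not the CXL mailbox API. The sketch only shows
a paged read that restarts when the generation number or total count
changes mid-read; the "no progress" check is an extra guard that the
patch does not have.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical device model: an extent count plus a generation number
 * that is bumped whenever the list changes. */
struct mock_dev {
	uint32_t generation;
	uint32_t nr_extents;
	uint32_t change_after_reads;	/* change the list on this query; 0 = never */
	uint32_t reads;			/* queries issued so far */
};

/* Header fields of one (mock) extent list response. */
struct page_out {
	uint32_t returned;
	uint32_t total;
	uint32_t generation;
};

/* Mock query: report up to @max extents starting at index @start. */
static void mock_get_page(struct mock_dev *dev, uint32_t start, uint32_t max,
			  struct page_out *out)
{
	uint32_t left;

	dev->reads++;
	if (dev->change_after_reads && dev->reads == dev->change_after_reads) {
		/* Simulate an extent being added while the host is reading. */
		dev->generation++;
		dev->nr_extents++;
	}

	left = start < dev->nr_extents ? dev->nr_extents - start : 0;
	out->returned = left < max ? left : max;
	out->total = dev->nr_extents;
	out->generation = dev->generation;
}

/* One pass over the list: extents read, or -EAGAIN if the list changed. */
static int read_extents_once(struct mock_dev *dev, uint32_t page_size)
{
	uint32_t index = 0, total = 0, gen = 0;
	int first = 1;

	do {
		struct page_out out;

		mock_get_page(dev, index, page_size, &out);

		/* Latch the count and generation seen by the first query. */
		if (first) {
			total = out.total;
			gen = out.generation;
			first = 0;
		}

		if (out.generation != gen || out.total != total)
			return -EAGAIN;

		/* Extra guard (not in the patch): bail out on no progress. */
		if (out.returned == 0 && index < total)
			return -EIO;

		index += out.returned;
	} while (index < total);

	return (int)index;
}

#define READ_RETRY 10

/* Retry the whole read while the list keeps changing underneath it. */
static int read_extents(struct mock_dev *dev, uint32_t page_size)
{
	int retry = READ_RETRY;
	int rc;

	do {
		rc = read_extents_once(dev, page_size);
	} while (rc == -EAGAIN && retry--);

	return rc;
}
```

With a stable list of 5 extents and a page size of 2 the read completes
in three queries. If the list changes on the second query, the first
pass returns -EAGAIN and the second pass reads the new list from the
start, which is why a retry can meet extents that were already seen.
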
Reviewed-by: Jonathan Cameron
Reviewed-by: Fan Ni
Signed-off-by: Ira Weiny
---
 drivers/cxl/core/core.h       |   1 +
 drivers/cxl/core/mbox.c       | 116 ++++++++++++++++++++++++++++++++++
 drivers/cxl/core/region_dax.c |  27 ++++++++
 drivers/cxl/cxlmem.h          |  21 ++++++
 4 files changed, 165 insertions(+)

diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index c28e357c5817..f5b05de5ed83 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -28,6 +28,7 @@ cxled_to_mds(struct cxl_endpoint_decoder *cxled)
 	return container_of(cxlds, struct cxl_memdev_state, cxlds);
 }
 
+int cxl_process_extent_list(struct cxl_endpoint_decoder *cxled);
 int cxl_region_invalidate_memregion(struct cxl_region *cxlr);
 
 #ifdef CONFIG_CXL_REGION
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 8071c1ed1b36..486110e1c03d 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -2083,6 +2083,122 @@ int cxl_dev_dc_identify(struct cxl_mailbox *mbox,
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dev_dc_identify, "CXL");
 
+/* Return -EAGAIN if the extent list changes while reading */
+static int __cxl_process_extent_list(struct cxl_endpoint_decoder *cxled)
+{
+	u32 current_index, total_read, total_expected, initial_gen_num;
+	struct cxl_memdev_state *mds = cxled_to_mds(cxled);
+	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
+	struct device *dev = mds->cxlds.dev;
+	struct cxl_mbox_cmd mbox_cmd;
+	u32 max_extent_count;
+	int latched_rc = 0;
+	bool first = true;
+
+	struct cxl_mbox_get_extent_out *extents __free(kvfree) =
+				kvmalloc(cxl_mbox->payload_size, GFP_KERNEL);
+	if (!extents)
+		return -ENOMEM;
+
+	total_read = 0;
+	current_index = 0;
+	total_expected = 0;
+	max_extent_count = (cxl_mbox->payload_size - sizeof(*extents)) /
+				sizeof(struct cxl_extent);
+	do {
+		u32 nr_returned, current_total, current_gen_num;
+		struct cxl_mbox_get_extent_in get_extent;
+		int rc;
+
+		get_extent = (struct cxl_mbox_get_extent_in) {
+			.extent_cnt = cpu_to_le32(max(max_extent_count,
+						total_expected - current_index)),
+			.start_extent_index = cpu_to_le32(current_index),
+		};
+
+		mbox_cmd = (struct cxl_mbox_cmd) {
+			.opcode = CXL_MBOX_OP_GET_DC_EXTENT_LIST,
+			.payload_in = &get_extent,
+			.size_in = sizeof(get_extent),
+			.size_out = cxl_mbox->payload_size,
+			.payload_out = extents,
+			.min_out = 1,
+		};
+
+		rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
+		if (rc < 0)
+			return rc;
+
+		/* Save initial data */
+		if (first) {
+			total_expected = le32_to_cpu(extents->total_extent_count);
+			initial_gen_num = le32_to_cpu(extents->generation_num);
+			first = false;
+		}
+
+		nr_returned = le32_to_cpu(extents->returned_extent_count);
+		total_read += nr_returned;
+		current_total = le32_to_cpu(extents->total_extent_count);
+		current_gen_num = le32_to_cpu(extents->generation_num);
+
+		dev_dbg(dev, "Got extent list %d-%d of %d generation Num:%d\n",
+			current_index, total_read - 1, current_total, current_gen_num);
+
+		if (current_gen_num != initial_gen_num || total_expected != current_total) {
+			dev_warn(dev, "Extent list change detected; gen %u != %u : cnt %u != %u\n",
+				 current_gen_num, initial_gen_num,
+				 total_expected, current_total);
+			return -EAGAIN;
+		}
+
+		for (int i = 0; i < nr_returned ; i++) {
+			struct cxl_extent *extent = &extents->extent[i];
+
+			dev_dbg(dev, "Processing extent %d/%d\n",
+				current_index + i, total_expected);
+
+			rc = add_to_pending_list(&mds->add_ctx.pending_extents,
+						 extent);
+			if (rc) {
+				latched_rc = rc;
+			}
+		}
+
+		current_index += nr_returned;
+	} while (total_expected > total_read);
+
+	if (!latched_rc && !list_empty(&mds->add_ctx.pending_extents)) {
+		latched_rc = cxl_add_pending(mds);
+	}
+	clear_pending_extents(mds);
+
+	return latched_rc;
+}
+
+#define CXL_READ_EXTENT_LIST_RETRY 10
+
+/**
+ * cxl_process_extent_list() - Read existing extents
+ * @cxled: Endpoint decoder which is part of a region
+ *
+ * Issue the Get Dynamic Capacity Extent List command to the device
+ * and add existing extents if found.
+ *
+ * A retry of 10 is somewhat arbitrary, however, extent changes should be
+ * relatively rare while bringing up a region. So 10 should be plenty.
+ */
+int cxl_process_extent_list(struct cxl_endpoint_decoder *cxled)
+{
+	int retry = CXL_READ_EXTENT_LIST_RETRY;
+	int rc;
+
+	do {
+		rc = __cxl_process_extent_list(cxled);
+	} while (rc == -EAGAIN && retry--);
+
+	return rc;
+}
+
 static void add_part(struct cxl_dpa_info *info, u64 start, u64 size, enum cxl_partition_mode mode)
 {
 	int i = info->nr_partitions;
diff --git a/drivers/cxl/core/region_dax.c b/drivers/cxl/core/region_dax.c
index 519e203c486a..e7a812e8b2e7 100644
--- a/drivers/cxl/core/region_dax.c
+++ b/drivers/cxl/core/region_dax.c
@@ -82,6 +82,26 @@ static void cxlr_dax_unregister(void *_cxlr_dax)
 	device_unregister(&cxlr_dax->dev);
 }
 
+static int cxlr_add_existing_extents(struct cxl_region *cxlr)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	int i, latched_rc = 0;
+
+	for (i = 0; i < p->nr_targets; i++) {
+		struct device *dev = &p->targets[i]->cxld.dev;
+		int rc;
+
+		rc = cxl_process_extent_list(p->targets[i]);
+		if (rc) {
+			dev_err(dev, "Existing extent processing failed %d\n",
+				rc);
+			latched_rc = rc;
+		}
+	}
+
+	return latched_rc;
+}
+
 int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 {
 	struct device *dev;
@@ -110,6 +130,13 @@ int devm_cxl_add_dax_region(struct cxl_region *cxlr)
 	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
 		dev_name(dev));
 
+	if (cxlr->mode == CXL_PARTMODE_DYNAMIC_RAM_A) {
+		rc = cxlr_add_existing_extents(cxlr);
+		if (rc)
+			dev_err(&cxlr->dev,
+				"Existing extent processing failed %d\n", rc);
+	}
+
 	return devm_add_action_or_reset(&cxlr->dev, cxlr_dax_unregister,
 					no_free_ptr(cxlr_dax));
 }
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d992cc9b7811..1ad3dc7e413c 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -564,6 +564,27 @@ struct cxl_mbox_dc_response {
 	} __packed extent_list[] __counted_by(extent_list_size);
 } __packed;
 
+/*
+ * Get Dynamic Capacity Extent List; Input Payload
+ * CXL rev 3.1 section 8.2.9.9.9.2; Table 8-166
+ */
+struct cxl_mbox_get_extent_in {
+	__le32 extent_cnt;
+	__le32 start_extent_index;
+} __packed;
+
+/*
+ * Get Dynamic Capacity Extent List; Output Payload
+ * CXL rev 3.1 section 8.2.9.9.9.2; Table 8-167
+ */
+struct cxl_mbox_get_extent_out {
+	__le32 returned_extent_count;
+	__le32 total_extent_count;
+	__le32 generation_num;
+	u8 rsvd[4];
+	struct cxl_extent extent[];
+} __packed;
+
 struct cxl_mbox_get_supported_logs {
 	__le16 entries;
 	u8 rsvd[6];
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny,
 Jonathan Cameron, Fan Ni
Subject: [PATCH v10 28/31] cxl/mem: Trace Dynamic capacity Event Record
Date: Sat, 23 May 2026 02:43:22 -0700
Message-ID: <54f9e863fac7a9c040267a13cd36aa7415e29f4f.1779528761.git.anisa.su@samsung.com>

From: Ira Weiny

CXL rev 3.1 section 8.2.9.2.1 adds the Dynamic Capacity Event Records.
User space can use trace events to debug dynamic capacity changes.

Add DC trace points to the trace log.

Based on an original patch by Navneet Singh.

Reviewed-by: Jonathan Cameron
Reviewed-by: Dave Jiang
Reviewed-by: Fan Ni
Signed-off-by: Ira Weiny
---
 drivers/cxl/core/mbox.c  |  5 ++++
 drivers/cxl/core/trace.h | 65 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 486110e1c03d..271f4556db85 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1030,6 +1030,11 @@ static void __cxl_event_trace_record(struct cxl_memdev *cxlmd,
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_SPARING_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_SPARING;
+	else if (uuid_equal(uuid, &CXL_EVENT_DC_EVENT_UUID)) {
+/* FIXME still valid? */
+		trace_cxl_dynamic_capacity(cxlmd, type, &record->event.dcd);
+		return;
+	}
 
 	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
 }
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index a972e4ef1936..421e492d1b3f 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -1099,6 +1099,71 @@ TRACE_EVENT(cxl_poison,
 	)
 );
 
+/*
+ * Dynamic Capacity Event Record - DER
+ *
+ * CXL rev 3.1 section 8.2.9.2.1.6 Table 8-50
+ */
+
+#define CXL_DC_ADD_CAPACITY			0x00
+#define CXL_DC_REL_CAPACITY			0x01
+#define CXL_DC_FORCED_REL_CAPACITY		0x02
+#define CXL_DC_REG_CONF_UPDATED			0x03
+#define show_dc_evt_type(type)	__print_symbolic(type,		\
+	{ CXL_DC_ADD_CAPACITY,	"Add capacity"},		\
+	{ CXL_DC_REL_CAPACITY,	"Release capacity"},		\
+	{ CXL_DC_FORCED_REL_CAPACITY,	"Forced capacity release"},	\
+	{ CXL_DC_REG_CONF_UPDATED,	"Region Configuration Updated"	} \
+)
+
+TRACE_EVENT(cxl_dynamic_capacity,
+
+	TP_PROTO(const struct cxl_memdev *cxlmd, enum cxl_event_log_type log,
+		 struct cxl_event_dcd *rec),
+
+	TP_ARGS(cxlmd, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+
+		/* Dynamic capacity Event */
+		__field(u8, event_type)
+		__field(u16, hostid)
+		__field(u8, partition_id)
+		__field(u64, dpa_start)
+		__field(u64, length)
+		__array(u8, uuid, UUID_SIZE)
+		__field(u16, sh_extent_seq)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(cxlmd, log, rec->hdr);
+
+		/* Dynamic_capacity Event */
+		__entry->event_type = rec->event_type;
+
+		/* DCD event record data */
+		__entry->hostid = le16_to_cpu(rec->host_id);
+		__entry->partition_id = rec->partition_index;
+		__entry->dpa_start = le64_to_cpu(rec->extent.start_dpa);
+		__entry->length = le64_to_cpu(rec->extent.length);
+		memcpy(__entry->uuid, &rec->extent.uuid, UUID_SIZE);
+		__entry->sh_extent_seq = le16_to_cpu(rec->extent.shared_extn_seq);
+	),
+
+	CXL_EVT_TP_printk("event_type='%s' host_id='%d' partition_id='%d' " \
+		"starting_dpa=%llx length=%llx tag=%pU " \
+		"shared_extent_sequence=%d",
+		show_dc_evt_type(__entry->event_type),
+		__entry->hostid,
+		__entry->partition_id,
+		__entry->dpa_start,
+		__entry->length,
+		__entry->uuid,
+		__entry->sh_extent_seq
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
-- 
2.43.0

From nobody Sun May 24 20:33:06 2026
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny,
 Jonathan Cameron
Subject: [PATCH v10 29/31] tools/testing/cxl: Make event logs dynamic
Date: Sat, 23 May 2026 02:43:23 -0700
Message-ID: <41c47ec44202b7a2491f89752247d8968758e213.1779528761.git.anisa.su@samsung.com>

From: Ira Weiny

The mock event logs were created as static arrays as an easy way to
mock events. Dynamic Capacity Device (DCD) test support requires that
events be generated dynamically when extents are created or destroyed.

The current event log test has specific checks for the number of events
seen, including log overflow.

Modify mock event logs to be dynamically allocated. Adjust array size
and mock event entry data to match the output expected by the existing
event test.

Use the static event data to create the dynamic events in the new logs
without inventing complex event injection for the previous tests.

Simplify log processing by using the event log array index as the
handle. Add a lock to manage the concurrency required when user space
is allowed to control DCD extents.

Reviewed-by: Jonathan Cameron
Reviewed-by: Dave Jiang
Signed-off-by: Ira Weiny
---
 tools/testing/cxl/test/mem.c | 265 +++++++++++++++++++++--------------
 1 file changed, 161 insertions(+), 104 deletions(-)

diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 271c7ad8cc32..fe1dadddd18e 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -142,18 +142,26 @@ static struct {
 
 #define PASS_TRY_LIMIT 3
 
-#define CXL_TEST_EVENT_CNT_MAX 15
+#define CXL_TEST_EVENT_CNT_MAX 16
+/* 1 extra slot to accommodate that handles can't be 0 */
+#define CXL_TEST_EVENT_ARRAY_SIZE (CXL_TEST_EVENT_CNT_MAX + 1)
 
 /* Set a number of events to return at a time for simulation. */
 #define CXL_TEST_EVENT_RET_MAX 4
 
+/*
+ * @last_handle: last handle (index) to have an entry stored
+ * @current_handle: current handle (index) to be returned to the user on get_event
+ * @nr_overflow: number of events added past the log size
+ * @lock: protect these state variables
+ * @events: array of pending events to be returned.
+ */
 struct mock_event_log {
-	u16 clear_idx;
-	u16 cur_idx;
-	u16 nr_events;
+	u16 last_handle;
+	u16 current_handle;
 	u16 nr_overflow;
-	u16 overflow_reset;
-	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
+	rwlock_t lock;
+	struct cxl_event_record_raw *events[CXL_TEST_EVENT_ARRAY_SIZE];
 };
 
 struct mock_event_store {
@@ -194,56 +202,65 @@ static struct mock_event_log *event_find_log(struct device *dev, int log_type)
 	return &mdata->mes.mock_logs[log_type];
 }
 
-static struct cxl_event_record_raw *event_get_current(struct mock_event_log *log)
-{
-	return log->events[log->cur_idx];
-}
-
-static void event_reset_log(struct mock_event_log *log)
-{
-	log->cur_idx = 0;
-	log->clear_idx = 0;
-	log->nr_overflow = log->overflow_reset;
-}
-
 /* Handle can never be 0 use 1 based indexing for handle */
-static u16 event_get_clear_handle(struct mock_event_log *log)
+static u16 event_inc_handle(u16 handle)
 {
-	return log->clear_idx + 1;
+	handle = (handle + 1) % CXL_TEST_EVENT_ARRAY_SIZE;
+	if (handle == 0)
+		handle = 1;
+	return handle;
 }
 
-/* Handle can never be 0 use 1 based indexing for handle */
-static __le16 event_get_cur_event_handle(struct mock_event_log *log)
-{
-	u16 cur_handle = log->cur_idx + 1;
-
-	return cpu_to_le16(cur_handle);
-}
-
-static bool event_log_empty(struct mock_event_log *log)
-{
-	return log->cur_idx == log->nr_events;
-}
-
-static void mes_add_event(struct mock_event_store *mes,
+/* Add the event or free it on overflow */
+static void mes_add_event(struct cxl_mockmem_data *mdata,
 			  enum cxl_event_log_type log_type,
 			  struct cxl_event_record_raw *event)
 {
+	struct device *dev = mdata->mds->cxlds.dev;
 	struct mock_event_log *log;
 
 	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
 		return;
 
-	log = &mes->mock_logs[log_type];
+	log = &mdata->mes.mock_logs[log_type];
+
+	guard(write_lock)(&log->lock);
 
-	if ((log->nr_events + 1) > CXL_TEST_EVENT_CNT_MAX) {
+	dev_dbg(dev, "Add log %d cur %d last %d\n",
+		log_type, log->current_handle, log->last_handle);
+
+	/* Check next buffer */
+	if (event_inc_handle(log->last_handle) == log->current_handle) {
 		log->nr_overflow++;
-		log->overflow_reset = log->nr_overflow;
+		dev_dbg(dev, "Overflowing log %d nr %d\n",
+			log_type, log->nr_overflow);
+		devm_kfree(dev, event);
 		return;
 	}
 
-	log->events[log->nr_events] = event;
-	log->nr_events++;
+	dev_dbg(dev, "Log %d; handle %u\n", log_type, log->last_handle);
+	event->event.generic.hdr.handle = cpu_to_le16(log->last_handle);
+	log->events[log->last_handle] = event;
+	log->last_handle = event_inc_handle(log->last_handle);
+}
+
+static void mes_del_event(struct device *dev,
+			  struct mock_event_log *log,
+			  u16 handle)
+{
+	struct cxl_event_record_raw *record;
+
+	lockdep_assert(lockdep_is_held(&log->lock));
+
+	dev_dbg(dev, "Clearing event %u; record %u\n",
+		handle, log->current_handle);
+	record = log->events[handle];
+	if (!record)
+		dev_err(dev, "Mock event index %u empty?\n", handle);
+
+	log->events[handle] = NULL;
+	log->current_handle = event_inc_handle(log->current_handle);
+	devm_kfree(dev, record);
 }
 
 /*
@@ -257,6 +274,7 @@ static int mock_get_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 	struct cxl_get_event_payload *pl;
 	struct mock_event_log *log;
 	int ret_limit;
+	u16 handle;
 	u8 log_type;
 	int i;
 
@@ -276,22 +294,31 @@ static int mock_get_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 	memset(cmd->payload_out, 0, struct_size(pl, records, 0));
 
 	log = event_find_log(dev, log_type);
-	if (!log || event_log_empty(log))
+	if (!log)
 		return 0;
 
 	pl = cmd->payload_out;
 
-	for (i = 0; i < ret_limit && !event_log_empty(log); i++) {
-		memcpy(&pl->records[i], event_get_current(log),
-		       sizeof(pl->records[i]));
-		pl->records[i].event.generic.hdr.handle =
-				event_get_cur_event_handle(log);
-		log->cur_idx++;
+	guard(read_lock)(&log->lock);
+
+	handle = log->current_handle;
+	dev_dbg(dev, "Get log %d handle %u last %u\n",
+		log_type, handle, log->last_handle);
+	for (i = 0; i < ret_limit && handle != log->last_handle;
+	     i++, handle = event_inc_handle(handle)) {
+		struct cxl_event_record_raw *cur;
+
+		cur = log->events[handle];
+		dev_dbg(dev, "Sending event log %d handle %d idx %u\n",
+			log_type, le16_to_cpu(cur->event.generic.hdr.handle),
+			handle);
+		memcpy(&pl->records[i], cur, sizeof(pl->records[i]));
+		pl->records[i].event.generic.hdr.handle = cpu_to_le16(handle);
 	}
 
 	cmd->size_out = struct_size(pl, records, i);
 	pl->record_count = cpu_to_le16(i);
-	if (!event_log_empty(log))
+	if (handle != log->last_handle)
 		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
 
 	if (log->nr_overflow) {
@@ -313,8 +340,8 @@ static int mock_get_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 static int mock_clear_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
-	struct mock_event_log *log;
 	u8 log_type = pl->event_log;
+	struct mock_event_log *log;
 	u16 handle;
 	int nr;
 
@@ -325,23 +352,20 @@ static int mock_clear_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 	if (!log)
 		return 0; /* No mock data in this log */
 
-	/*
-	 * This check is technically not invalid per the specification AFAICS.
-	 * (The host could 'guess' handles and clear them in order).
-	 * However, this is not good behavior for the host so test it.
-	 */
-	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
-		dev_err(dev,
-			"Attempting to clear more events than returned!\n");
-		return -EINVAL;
-	}
+	guard(write_lock)(&log->lock);
 
 	/* Check handle order prior to clearing events */
-	for (nr = 0, handle = event_get_clear_handle(log);
-	     nr < pl->nr_recs;
-	     nr++, handle++) {
+	handle = log->current_handle;
+	for (nr = 0; nr < pl->nr_recs && handle != log->last_handle;
+	     nr++, handle = event_inc_handle(handle)) {
+
+		dev_dbg(dev, "Checking clear of %d handle %u plhandle %u\n",
+			log_type, handle,
+			le16_to_cpu(pl->handles[nr]));
+
 		if (handle != le16_to_cpu(pl->handles[nr])) {
-			dev_err(dev, "Clearing events out of order\n");
+			dev_err(dev, "Clearing events out of order %u %u\n",
+				handle, le16_to_cpu(pl->handles[nr]));
 			return -EINVAL;
 		}
 	}
@@ -350,25 +374,12 @@ static int mock_clear_event(struct device *dev, struct cxl_mbox_cmd *cmd)
 		log->nr_overflow = 0;
 
 	/* Clear events */
-	log->clear_idx += pl->nr_recs;
-	return 0;
-}
-
-static void cxl_mock_event_trigger(struct device *dev)
-{
-	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
-	struct mock_event_store *mes = &mdata->mes;
-	int i;
-
-	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
-		struct mock_event_log *log;
+	for (nr = 0; nr < pl->nr_recs; nr++)
+		mes_del_event(dev, log, le16_to_cpu(pl->handles[nr]));
+	dev_dbg(dev, "Delete log %d cur %d last %d\n",
+		log_type, log->current_handle, log->last_handle);
 
-		log = event_find_log(dev, i);
-		if (log)
-			event_reset_log(log);
-	}
-
-	cxl_mem_get_event_records(mdata->mds, mes->ev_status);
+	return 0;
 }
 
 struct cxl_event_record_raw maint_needed = {
@@ -509,8 +520,27 @@ static int mock_set_timestamp(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
-static void cxl_mock_add_event_logs(struct mock_event_store *mes)
+/* Create a dynamically allocated event out of a statically defined event. */
+static void add_event_from_static(struct cxl_mockmem_data *mdata,
+				  enum cxl_event_log_type log_type,
+				  struct cxl_event_record_raw *raw)
 {
+	struct device *dev = mdata->mds->cxlds.dev;
+	struct cxl_event_record_raw *rec;
+
+	rec = devm_kmemdup(dev, raw, sizeof(*rec), GFP_KERNEL);
+	if (!rec) {
+		dev_err(dev, "Failed to alloc event for log\n");
+		return;
+	}
+	mes_add_event(mdata, log_type, rec);
+}
+
+static void cxl_mock_add_event_logs(struct cxl_mockmem_data *mdata)
+{
+	struct mock_event_store *mes = &mdata->mes;
+	struct device *dev = mdata->mds->cxlds.dev;
+
 	put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK |
 			   CXL_GMER_VALID_COMPONENT | CXL_GMER_VALID_COMPONENT_ID_FORMAT,
 			   &gen_media.rec.media_hdr.validity_flags);
@@ -523,43 +553,60 @@ static void cxl_mock_add_event_logs(struct mock_event_store *mes)
 	put_unaligned_le16(CXL_MMER_VALID_COMPONENT | CXL_MMER_VALID_COMPONENT_ID_FORMAT,
 			   &mem_module.rec.validity_flags);
 
-	mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
-	mes_add_event(mes, CXL_EVENT_TYPE_INFO,
+	dev_dbg(dev, "Generating fake event logs %d\n",
+		CXL_EVENT_TYPE_INFO);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_INFO, &maint_needed);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_INFO,
 		      (struct cxl_event_record_raw *)&gen_media);
-	mes_add_event(mes, CXL_EVENT_TYPE_INFO,
+	add_event_from_static(mdata, CXL_EVENT_TYPE_INFO,
 		      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &maint_needed);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+	dev_dbg(dev, "Generating fake event logs %d\n",
+		CXL_EVENT_TYPE_FAIL);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &maint_needed);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL,
+		      (struct cxl_event_record_raw *)&mem_module);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL,
 		      (struct cxl_event_record_raw *)&dram);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL,
 		      (struct cxl_event_record_raw *)&gen_media);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL,
 		      (struct cxl_event_record_raw *)&mem_module);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL,
 		      (struct cxl_event_record_raw *)&dram);
 	/* Overflow this log */
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
-	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	add_event_from_static(mdata, CXL_EVENT_TYPE_FAIL, &hardware_replace);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_FAIL;
 
-	mes_add_event(mes, CXL_EVENT_TYPE_FATAL,
&hardware_replace); - mes_add_event(mes, CXL_EVENT_TYPE_FATAL, + dev_dbg(dev, "Generating fake event logs %d\n", + CXL_EVENT_TYPE_FATAL); + add_event_from_static(mdata, CXL_EVENT_TYPE_FATAL, &hardware_replace); + add_event_from_static(mdata, CXL_EVENT_TYPE_FATAL, (struct cxl_event_record_raw *)&dram); mes->ev_status |=3D CXLDEV_EVENT_STATUS_FATAL; } =20 +static void cxl_mock_event_trigger(struct device *dev) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct mock_event_store *mes =3D &mdata->mes; + + cxl_mock_add_event_logs(mdata); + cxl_mem_get_event_records(mdata->mds, mes->ev_status); +} + static int mock_gsl(struct cxl_mbox_cmd *cmd) { if (cmd->size_out < sizeof(mock_gsl_payload)) @@ -1684,6 +1731,14 @@ static void cxl_mock_test_feat_init(struct cxl_mockm= em_data *mdata) mdata->test_feat.data =3D cpu_to_le32(0xdeadbeef); } =20 +static void init_event_log(struct mock_event_log *log) +{ + rwlock_init(&log->lock); + /* Handle can never be 0 use 1 based indexing for handle */ + log->current_handle =3D 1; + log->last_handle =3D 1; +} + static int cxl_mock_mem_probe(struct platform_device *pdev) { struct device *dev =3D &pdev->dev; @@ -1767,7 +1822,9 @@ static int cxl_mock_mem_probe(struct platform_device = *pdev) if (rc) dev_dbg(dev, "No CXL Features discovered\n"); =20 - cxl_mock_add_event_logs(&mdata->mes); + for (int i =3D 0; i < CXL_EVENT_TYPE_MAX; i++) + init_event_log(&mdata->mes.mock_logs[i]); + cxl_mock_add_event_logs(mdata); =20 cxlmd =3D devm_cxl_add_memdev(cxlds, NULL); if (IS_ERR(cxlmd)) --=20 2.43.0

From nobody Sun May 24 20:33:06 2026
From: Anisa Su
To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron,
 Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny,
 Alison Schofield, John Groves, Gregory Price, Ira Weiny, Anisa Su
Subject: [PATCH v10 30/31] tools/testing/cxl: Add DC Regions to mock mem data
Date: Sat, 23 May 2026 02:43:24 -0700

From: Ira Weiny

cxl_test provides a good way to ensure quick smoke and regression
testing. The complexity of Dynamic Capacity (DC) extent processing, as
well as the complexity of DC-backed DAX regions, can mostly be tested
through cxl_test. This includes management of DC regions and DAX
devices on those regions; the management of extent device lifetimes;
and the processing of DCD events.

The only functionality missing from this test is actual interrupt
processing.

Mock memory devices can easily mock DC information and manage fake
extent data.

Define mock_dc_partition information within the mock memory data. Add
sysfs entries on the mock device to inject and delete extents.

The inject format is <start>:<length>:<uuid>:<more>[:<seq>]
where <uuid> is a UUID string (or "" / "0" for the null UUID) and
<seq> is an optional shared_extn_seq value used for sharable-partition
tests (defaults to 0).

The delete format is <start>:<length>:<uuid>

Directly call the event irq callback to simulate irqs to process the
test extents.

Add DC mailbox commands to the CEL and implement those commands.
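
For illustration only, and not part of the patch: the inject string
described above could be split in user space roughly as follows. This
is a minimal sketch that assumes the field order start, length, uuid,
more, plus an optional trailing seq, as suggested by the parsing code
later in this patch. struct inject_req, parse_u64() and parse_inject()
are hypothetical names, and unlike the mock the sketch does not
validate the UUID text.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed fields; not a kernel type. */
struct inject_req {
	uint64_t start;
	uint64_t length;
	char uuid[37];		/* UUID text; "" or "0" means the null UUID */
	unsigned int more;
	uint16_t seq;		/* shared_extn_seq; 0 when the 5th field is absent */
};

/* Parse one unsigned field; reject empty strings and trailing junk. */
static int parse_u64(const char *s, uint64_t *out)
{
	char *end;

	if (!s || *s == '\0')
		return -EINVAL;
	errno = 0;
	*out = strtoull(s, &end, 0);
	if (errno || *end != '\0')
		return -EINVAL;
	return 0;
}

/* Split "start:length:uuid:more[:seq]" into req; 0 or -EINVAL. */
static int parse_inject(const char *buf, struct inject_req *req)
{
	char tmp[128];
	char *field[5] = { 0 };
	char *p, *sep;
	uint64_t val;
	int n = 0;

	if (strlen(buf) >= sizeof(tmp))
		return -EINVAL;
	strcpy(tmp, buf);
	tmp[strcspn(tmp, "\n")] = '\0';	/* drop the newline a shell write adds */

	/* Split by hand so an empty uuid field survives (strtok would skip it). */
	p = tmp;
	for (;;) {
		if (n == 5)
			return -EINVAL;	/* more than five fields */
		field[n++] = p;
		sep = strchr(p, ':');
		if (!sep)
			break;
		*sep = '\0';
		p = sep + 1;
	}
	if (n < 4)
		return -EINVAL;

	if (parse_u64(field[0], &req->start) ||
	    parse_u64(field[1], &req->length))
		return -EINVAL;
	if (strlen(field[2]) >= sizeof(req->uuid))
		return -EINVAL;
	strcpy(req->uuid, field[2]);
	if (parse_u64(field[3], &val))
		return -EINVAL;
	req->more = val != 0;

	req->seq = 0;		/* optional 5th field defaults to 0 */
	if (n == 5) {
		if (parse_u64(field[4], &val) || val > UINT16_MAX)
			return -EINVAL;
		req->seq = (uint16_t)val;
	}
	return 0;
}
```

Under these assumptions, a string with an empty uuid field and no seq
parses with the null-UUID shorthand and seq left at 0, while a string
with fewer than four fields is rejected with -EINVAL.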

Signed-off-by: Ira Weiny
Signed-off-by: Anisa Su
---
Changes:
[anisa: add uuid + shared_extn_seq, align mock with kernel validators,
 introduce a sharable-partition test fixture]
[anisa: replace "sparse" terminology with "DC" / "DC-backed"]

Carry a uuid_t and a u16 shared_extn_seq on each mock extent, parse
tags via uuid_parse() in the inject path and the pre-extent fixture,
and propagate both fields through log_dc_event() and
mock_get_dc_extent_list(). An optional 5th field in the inject format
supplies the shared_extn_seq for sharable-partition tests. The delete
format takes the uuid as its third field so release events carry tag
identity to the host.

Mock fixes required to satisfy the host-side validators:

- dsmad_handle starts at 0xFA, not 0xFADE. The Get Dynamic Capacity
  Configuration response's DSMAD Handle field is 1 byte per the CXL
  spec; the kernel rejects any handle with the upper 24 bits non-zero
  as a firmware-bug.

- dc_accept_extent() treats a re-accept of an already-accepted extent
  as a successful no-op (look up dc_accepted_exts when the sent xa
  lookup misses). The host replays accepts for pre-injected extents on
  region creation; without this the existing-extent ingest aborts with
  -ENOMEM.

- __dc_del_extent_store() runs strim() on the trailing uuid field so
  the '\n' shell write tail doesn't cause parse_tag() to fall through
  to uuid_parse() and -EINVAL.

- NUM_MOCK_DC_REGIONS reduced from 2 to 1. The host's
  cxl_dev_dc_identify() surfaces partitions[0] only, so extents seeded
  into a second mock partition land outside the registered DC range;
  for tagged groups that also trips the partition-equality gate and
  drops the whole group (including the in-range member).

Sharable-partition test fixture:

- Stamp MOCK_DC_SHARABLE_SERIAL (0xDCDC) on the cxl_mem instance at
  pdev->id == 0. The companion cxl_test driver checks this serial in
  mock_cxl_endpoint_parse_cdat() and sets the DC partition's
  perf.shareable on that memdev only, exposing both sharable and
  non-sharable DC partitions from one cxl_test module load so the
  userspace suite can exercise both regimes.

- Skip inject_prev_extents() on that one memdev: the pre-injected
  extents are untagged / seq=0 and would be rejected as firmware-bug
  by cxl_validate_extent() on a sharable partition, leaving spurious
  noise in dmesg at probe.
---
 tools/testing/cxl/test/cxl.c |  21 +
 tools/testing/cxl/test/mem.c | 806 ++++++++++++++++++++++++++++++++++-
 2 files changed, 826 insertions(+), 1 deletion(-)

diff --git a/tools/testing/cxl/test/cxl.c b/tools/testing/cxl/test/cxl.c index 418669927fb0..ac6060ede061 100644 --- a/tools/testing/cxl/test/cxl.c +++ b/tools/testing/cxl/test/cxl.c @@ -18,6 +18,15 @@ static int interleave_arithmetic; static bool extended_linear_cache; static bool fail_autoassemble; =20 +/* + * Mock serial sentinel. The cxl_mock_mem probe stamps this serial on + * exactly one platform device (cxl_mem with id 0); that single memdev's + * DC partition is marked sharable below in mock_cxl_endpoint_parse_cdat + * so the suite can exercise sharable-extent code paths without losing + * the non-sharable coverage on the other mock memdevs. + */ +#define MOCK_DC_SHARABLE_SERIAL 0xDCDCULL + #define FAKE_QTG_ID 42 =20 #define NR_CXL_HOST_BRIDGES 2 @@ -1432,6 +1441,18 @@ static void mock_cxl_endpoint_parse_cdat(struct cxl_= port *port) }; =20 dpa_perf_setup(port, &range, perf); + + /* + * The mock probe stamps MOCK_DC_SHARABLE_SERIAL onto exactly + * one cxl_mem instance; mark its DC partition sharable so + * cxl_validate_extent() routes shared-seq injects through + * the sharable regime. Every other memdev keeps its DC + * partition non-sharable so the existing untagged / seq=3D0 + * tests still run on this kernel.
+ */ + if (cxlds->part[i].mode =3D=3D CXL_PARTMODE_DYNAMIC_RAM_A && + cxlds->serial =3D=3D MOCK_DC_SHARABLE_SERIAL) + perf->shareable =3D true; } =20 cxl_memdev_update_perf(cxlmd); diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c index fe1dadddd18e..9cc97b718b5f 100644 --- a/tools/testing/cxl/test/mem.c +++ b/tools/testing/cxl/test/mem.c @@ -20,6 +20,7 @@ #define FW_SLOTS 3 #define DEV_SIZE SZ_2G #define EFFECT(x) (1U << x) +#define BASE_DYNAMIC_CAP_DPA DEV_SIZE =20 #define MOCK_INJECT_DEV_MAX 8 #define MOCK_INJECT_TEST_MAX 128 @@ -113,6 +114,22 @@ static struct cxl_cel_entry mock_cel[] =3D { EFFECT(SECURITY_CHANGE_IMMEDIATE) | EFFECT(BACKGROUND_OP)), }, + { + .opcode =3D cpu_to_le16(CXL_MBOX_OP_GET_DC_CONFIG), + .effect =3D CXL_CMD_EFFECT_NONE, + }, + { + .opcode =3D cpu_to_le16(CXL_MBOX_OP_GET_DC_EXTENT_LIST), + .effect =3D CXL_CMD_EFFECT_NONE, + }, + { + .opcode =3D cpu_to_le16(CXL_MBOX_OP_ADD_DC_RESPONSE), + .effect =3D cpu_to_le16(EFFECT(CONF_CHANGE_IMMEDIATE)), + }, + { + .opcode =3D cpu_to_le16(CXL_MBOX_OP_RELEASE_DC), + .effect =3D cpu_to_le16(EFFECT(CONF_CHANGE_IMMEDIATE)), + }, }; =20 /* See CXL 2.0 Table 181 Get Health Info Output Payload */ @@ -173,6 +190,16 @@ struct vendor_test_feat { __le32 data; } __packed; =20 +/* + * The kernel surfaces only the first DC partition reported by the + * device (cxl_dev_dc_identify() takes partitions[0] only), so any + * extents we pre-inject into a second mock partition end up rejected + * as "not in a valid DC partition" =E2=80=94 and for tagged groups they a= lso + * trip the partition-equality gate and drop the whole group (including + * the in-range member in DC0). Keep the mock at one DC partition. 
+ */ +#define NUM_MOCK_DC_REGIONS 1 + struct cxl_mockmem_data { void *lsa; void *fw; @@ -191,6 +218,20 @@ struct cxl_mockmem_data { unsigned long sanitize_timeout; struct vendor_test_feat test_feat; u8 shutdown_state; + + struct cxl_dc_partition dc_partitions[NUM_MOCK_DC_REGIONS]; + u32 dc_ext_generation; + struct mutex ext_lock; + + /* + * Extents are in 1 of 3 states + * FM (sysfs added but not sent to the host yet) + * sent (sent to the host but not accepted) + * accepted (by the host) + */ + struct xarray dc_fm_extents; + struct xarray dc_sent_extents; + struct xarray dc_accepted_exts; }; =20 static struct mock_event_log *event_find_log(struct device *dev, int log_t= ype) @@ -607,6 +648,229 @@ static void cxl_mock_event_trigger(struct device *dev) cxl_mem_get_event_records(mdata->mds, mes->ev_status); } =20 +struct cxl_extent_data { + u64 dpa_start; + u64 length; + uuid_t uuid; + u16 shared_extn_seq; + bool shared; +}; + +/* + * Parse a tag string into a uuid_t. Accepts the empty string and "0" + * as shorthand for the null UUID; anything else must be a UUID string + * uuid_parse() can understand. 
+ */ +static int parse_tag(const char *tag, uuid_t *out) +{ + if (!tag || tag[0] =3D=3D '\0' || strcmp(tag, "0") =3D=3D 0) { + uuid_copy(out, &uuid_null); + return 0; + } + return uuid_parse(tag, out); +} + +static int __devm_add_extent(struct device *dev, struct xarray *array, + u64 start, u64 length, const char *tag, + u16 shared_extn_seq, bool shared) +{ + struct cxl_extent_data *extent; + int rc; + + extent =3D devm_kzalloc(dev, sizeof(*extent), GFP_KERNEL); + if (!extent) + return -ENOMEM; + + extent->dpa_start =3D start; + extent->length =3D length; + rc =3D parse_tag(tag, &extent->uuid); + if (rc) { + dev_err(dev, "Failed to parse tag '%s'\n", tag); + devm_kfree(dev, extent); + return rc; + } + extent->shared_extn_seq =3D shared_extn_seq; + extent->shared =3D shared; + + if (xa_insert(array, start, extent, GFP_KERNEL)) { + devm_kfree(dev, extent); + dev_err(dev, "Failed xarry insert %#llx\n", start); + return -EINVAL; + } + + return 0; +} + +static int devm_add_fm_extent(struct device *dev, u64 start, u64 length, + const char *tag, u16 shared_extn_seq, bool shared) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + + guard(mutex)(&mdata->ext_lock); + return __devm_add_extent(dev, &mdata->dc_fm_extents, start, length, + tag, shared_extn_seq, shared); +} + +static int dc_accept_extent(struct device *dev, u64 start, u64 length) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_extent_data *ext; + + dev_dbg(dev, "Host accepting extent %#llx\n", start); + mdata->dc_ext_generation++; + + lockdep_assert_held(&mdata->ext_lock); + ext =3D xa_load(&mdata->dc_sent_extents, start); + if (!ext || ext->length !=3D length) { + /* + * The host may re-accept extents we already moved into the + * accepted xarray (e.g. pre-injected extents replayed on + * region creation). Treat that as a successful no-op so + * the existing-extent ingest path doesn't abort. 
+ */ + ext =3D xa_load(&mdata->dc_accepted_exts, start); + if (ext && ext->length =3D=3D length) + return 0; + dev_err(dev, "Extent %#llx-%#llx not found\n", + start, start + length); + return -ENOMEM; + } + xa_erase(&mdata->dc_sent_extents, start); + return xa_insert(&mdata->dc_accepted_exts, start, ext, GFP_KERNEL); +} + +static void release_dc_ext(void *md) +{ + struct cxl_mockmem_data *mdata =3D md; + + xa_destroy(&mdata->dc_fm_extents); + xa_destroy(&mdata->dc_sent_extents); + xa_destroy(&mdata->dc_accepted_exts); +} + +/* Pretend to have some previous accepted extents */ +struct pre_ext_info { + u64 offset; + u64 length; + const char *tag; +} pre_ext_info[] =3D { + { + .offset =3D SZ_128M, + .length =3D SZ_64M, + .tag =3D "", + }, + { + .offset =3D SZ_256M, + .length =3D SZ_64M, + .tag =3D "deadbeef-cafe-baad-f00d-fedcba987654", + }, +}; + +static int devm_add_sent_extent(struct device *dev, u64 start, u64 length, + const char *tag, u16 shared_extn_seq, bool shared) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + + lockdep_assert_held(&mdata->ext_lock); + return __devm_add_extent(dev, &mdata->dc_sent_extents, start, length, + tag, shared_extn_seq, shared); +} + +static int inject_prev_extents(struct device *dev, u64 base_dpa) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + int rc; + + dev_dbg(dev, "Adding %ld pre-extents for testing\n", + ARRAY_SIZE(pre_ext_info)); + + guard(mutex)(&mdata->ext_lock); + for (int i =3D 0; i < ARRAY_SIZE(pre_ext_info); i++) { + u64 ext_dpa =3D base_dpa + pre_ext_info[i].offset; + u64 ext_len =3D pre_ext_info[i].length; + + dev_dbg(dev, "Adding pre-extent DPA:%#llx LEN:%#llx tag:%s\n", + ext_dpa, ext_len, pre_ext_info[i].tag); + + rc =3D devm_add_sent_extent(dev, ext_dpa, ext_len, + pre_ext_info[i].tag, 0, false); + if (rc) { + dev_err(dev, "Failed to add pre-extent DPA:%#llx LEN:%#llx; %d\n", + ext_dpa, ext_len, rc); + return rc; + } + + rc =3D dc_accept_extent(dev, ext_dpa, ext_len); + if (rc) + 
return rc; + } + return 0; +} + +static int cxl_mock_dc_partition_setup(struct device *dev) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + u64 base_dpa =3D BASE_DYNAMIC_CAP_DPA; + u32 dsmad_handle =3D 0xFA; + u64 decode_length =3D SZ_512M; + u64 block_size =3D SZ_512; + u64 length =3D SZ_512M; + int rc; + + mutex_init(&mdata->ext_lock); + xa_init(&mdata->dc_fm_extents); + xa_init(&mdata->dc_sent_extents); + xa_init(&mdata->dc_accepted_exts); + + rc =3D devm_add_action_or_reset(dev, release_dc_ext, mdata); + if (rc) + return rc; + + for (int i =3D 0; i < NUM_MOCK_DC_REGIONS; i++) { + struct cxl_dc_partition *part =3D &mdata->dc_partitions[i]; + + dev_dbg(dev, "Creating DC partition DC%d DPA:%#llx LEN:%#llx\n", + i, base_dpa, length); + + part->base =3D cpu_to_le64(base_dpa); + part->decode_length =3D cpu_to_le64(decode_length / + CXL_CAPACITY_MULTIPLIER); + part->length =3D cpu_to_le64(length); + part->block_size =3D cpu_to_le64(block_size); + part->dsmad_handle =3D cpu_to_le32(dsmad_handle); + dsmad_handle++; + + /* + * Skip pre-injection on the sharable mock memdev. The + * pre-injected extents are untagged / seq=3D0, which a + * sharable partition rejects as firmware-bug; leaving the + * sharable memdev with an empty DC partition is what its + * dedicated tests (test_shared_extent_inject and + * test_seq_integrity_gap in cxl-dcd.sh) expect anyway. + * + * The sharable fixture is the memdev at pdev->id =3D=3D 0 =E2=80=94 + * see the matching MOCK_DC_SHARABLE_SERIAL stamp in + * cxl_mock_mem_probe(). This relies on tools/testing/cxl + * always allocating a "cxl_mem" platform device with id 0 + * as the first memdev; if that invariant ever breaks the + * sharable test fixture will land on the wrong device. 
+ */ + if (to_platform_device(dev)->id !=3D 0) { + rc =3D inject_prev_extents(dev, base_dpa); + if (rc) { + dev_err(dev, + "Failed to add pre-extents for DC%d\n", + i); + return rc; + } + } + + base_dpa +=3D decode_length; + } + + return 0; +} + static int mock_gsl(struct cxl_mbox_cmd *cmd) { if (cmd->size_out < sizeof(mock_gsl_payload)) @@ -1582,6 +1846,193 @@ static int mock_get_supported_features(struct cxl_m= ockmem_data *mdata, return 0; } =20 +static int mock_get_dc_config(struct device *dev, + struct cxl_mbox_cmd *cmd) +{ + struct cxl_mbox_get_dc_config_in *dc_config =3D cmd->payload_in; + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + u8 partition_requested, partition_start_idx, partition_ret_cnt; + struct cxl_mbox_get_dc_config_out *resp; + int i; + + partition_requested =3D min(dc_config->partition_count, NUM_MOCK_DC_REGIO= NS); + + if (cmd->size_out < struct_size(resp, partition, partition_requested)) + return -EINVAL; + + memset(cmd->payload_out, 0, cmd->size_out); + resp =3D cmd->payload_out; + + partition_start_idx =3D dc_config->start_partition_index; + partition_ret_cnt =3D 0; + for (i =3D 0; i < NUM_MOCK_DC_REGIONS; i++) { + if (i >=3D partition_start_idx) { + memcpy(&resp->partition[partition_ret_cnt], + &mdata->dc_partitions[i], + sizeof(resp->partition[partition_ret_cnt])); + partition_ret_cnt++; + } + } + resp->avail_partition_count =3D NUM_MOCK_DC_REGIONS; + resp->partitions_returned =3D i; + + dev_dbg(dev, "Returning %d dc partitions\n", partition_ret_cnt); + return 0; +} + +static int mock_get_dc_extent_list(struct device *dev, + struct cxl_mbox_cmd *cmd) +{ + struct cxl_mbox_get_extent_out *resp =3D cmd->payload_out; + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_mbox_get_extent_in *get =3D cmd->payload_in; + u32 total_avail =3D 0, total_ret =3D 0; + struct cxl_extent_data *ext; + u32 ext_count, start_idx; + unsigned long i; + + ext_count =3D le32_to_cpu(get->extent_cnt); + start_idx =3D 
le32_to_cpu(get->start_extent_index); + + memset(resp, 0, sizeof(*resp)); + + guard(mutex)(&mdata->ext_lock); + /* + * Total available needs to be calculated and returned regardless of + * how many can actually be returned. + */ + xa_for_each(&mdata->dc_accepted_exts, i, ext) + total_avail++; + + if (start_idx > total_avail) + return -EINVAL; + + xa_for_each(&mdata->dc_accepted_exts, i, ext) { + if (total_ret >=3D ext_count) + break; + + if (total_ret >=3D start_idx) { + resp->extent[total_ret].start_dpa =3D + cpu_to_le64(ext->dpa_start); + resp->extent[total_ret].length =3D + cpu_to_le64(ext->length); + export_uuid(resp->extent[total_ret].uuid, &ext->uuid); + resp->extent[total_ret].shared_extn_seq =3D + cpu_to_le16(ext->shared_extn_seq); + total_ret++; + } + } + + resp->returned_extent_count =3D cpu_to_le32(total_ret); + resp->total_extent_count =3D cpu_to_le32(total_avail); + resp->generation_num =3D cpu_to_le32(mdata->dc_ext_generation); + + dev_dbg(dev, "Returning %d extents of %d total\n", + total_ret, total_avail); + + return 0; +} + +static void dc_clear_sent(struct device *dev) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_extent_data *ext; + unsigned long index; + + lockdep_assert_held(&mdata->ext_lock); + + /* Any extents not accepted must be cleared */ + xa_for_each(&mdata->dc_sent_extents, index, ext) { + dev_dbg(dev, "Host rejected extent %#llx\n", ext->dpa_start); + xa_erase(&mdata->dc_sent_extents, ext->dpa_start); + } +} + +static int mock_add_dc_response(struct device *dev, + struct cxl_mbox_cmd *cmd) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_mbox_dc_response *req =3D cmd->payload_in; + u32 list_size =3D le32_to_cpu(req->extent_list_size); + + guard(mutex)(&mdata->ext_lock); + for (int i =3D 0; i < list_size; i++) { + u64 start =3D le64_to_cpu(req->extent_list[i].dpa_start); + u64 length =3D le64_to_cpu(req->extent_list[i].length); + int rc; + + rc =3D dc_accept_extent(dev, start, 
length); + if (rc) + return rc; + } + + dc_clear_sent(dev); + return 0; +} + +static void dc_delete_extent(struct device *dev, unsigned long long start, + unsigned long long length) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + unsigned long long end =3D start + length; + struct cxl_extent_data *ext; + unsigned long index; + + dev_dbg(dev, "Deleting extent at %#llx len:%#llx\n", start, length); + + guard(mutex)(&mdata->ext_lock); + xa_for_each(&mdata->dc_fm_extents, index, ext) { + u64 extent_end =3D ext->dpa_start + ext->length; + + /* + * Any extent which 'touches' the released delete range will be + * removed. + */ + if ((start <=3D ext->dpa_start && ext->dpa_start < end) || + (start <=3D extent_end && extent_end < end)) + xa_erase(&mdata->dc_fm_extents, ext->dpa_start); + } + + /* + * If the extent was accepted let it be for the host to drop + * later. + */ +} + +static int release_accepted_extent(struct device *dev, u64 start, u64 leng= th) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_extent_data *ext; + + guard(mutex)(&mdata->ext_lock); + ext =3D xa_load(&mdata->dc_accepted_exts, start); + if (!ext || ext->length !=3D length) { + dev_err(dev, "Extent %#llx not in accepted state\n", start); + return -EINVAL; + } + xa_erase(&mdata->dc_accepted_exts, start); + mdata->dc_ext_generation++; + + return 0; +} + +static int mock_dc_release(struct device *dev, + struct cxl_mbox_cmd *cmd) +{ + struct cxl_mbox_dc_response *req =3D cmd->payload_in; + u32 list_size =3D le32_to_cpu(req->extent_list_size); + + for (int i =3D 0; i < list_size; i++) { + u64 start =3D le64_to_cpu(req->extent_list[i].dpa_start); + u64 length =3D le64_to_cpu(req->extent_list[i].length); + + dev_dbg(dev, "Extent %#llx released by host\n", start); + release_accepted_extent(dev, start, length); + } + + return 0; +} + static int cxl_mock_mbox_send(struct cxl_mailbox *cxl_mbox, struct cxl_mbox_cmd *cmd) { @@ -1673,6 +2124,18 @@ static int 
cxl_mock_mbox_send(struct cxl_mailbox *cx= l_mbox, case CXL_MBOX_OP_GET_SUPPORTED_FEATURES: rc =3D mock_get_supported_features(mdata, cmd); break; + case CXL_MBOX_OP_GET_DC_CONFIG: + rc =3D mock_get_dc_config(dev, cmd); + break; + case CXL_MBOX_OP_GET_DC_EXTENT_LIST: + rc =3D mock_get_dc_extent_list(dev, cmd); + break; + case CXL_MBOX_OP_ADD_DC_RESPONSE: + rc =3D mock_add_dc_response(dev, cmd); + break; + case CXL_MBOX_OP_RELEASE_DC: + rc =3D mock_dc_release(dev, cmd); + break; case CXL_MBOX_OP_GET_FEATURE: rc =3D mock_get_feature(mdata, cmd); break; @@ -1739,6 +2202,14 @@ static void init_event_log(struct mock_event_log *lo= g) log->last_handle =3D 1; } =20 +/* + * Stamp this serial on a single mock cxl_mem instance so the + * companion cxl_test driver can find it and mark its DC partition + * sharable in mock_cxl_endpoint_parse_cdat(). Must match the value + * defined in tools/testing/cxl/test/cxl.c. + */ +#define MOCK_DC_SHARABLE_SERIAL 0xDCDCULL + static int cxl_mock_mem_probe(struct platform_device *pdev) { struct device *dev =3D &pdev->dev; @@ -1758,6 +2229,10 @@ static int cxl_mock_mem_probe(struct platform_device= *pdev) return -ENOMEM; dev_set_drvdata(dev, mdata); =20 + rc =3D cxl_mock_dc_partition_setup(dev); + if (rc) + return rc; + mdata->lsa =3D vmalloc(LSA_SIZE); if (!mdata->lsa) return -ENOMEM; @@ -1774,7 +2249,23 @@ static int cxl_mock_mem_probe(struct platform_device= *pdev) if (rc) return rc; =20 - mds =3D cxl_memdev_state_create(dev, pdev->id + 1, 0); + { + u64 serial =3D pdev->id + 1; + + /* + * Reserve the memdev at pdev->id =3D=3D 0 as the sharable DC + * partition test fixture. This relies on tools/testing/cxl + * always allocating a "cxl_mem" platform device with id 0 + * as the first memdev =E2=80=94 currently true in cxl.c, but if + * the topology ever renumbers, the sharable serial will be + * stamped on the wrong device (or no device). 
Matched by + * the skip-pre-inject guard in cxl_mock_dc_partition_setup + * and by mock_cxl_endpoint_parse_cdat in cxl_test. + */ + if (pdev->id =3D=3D 0) + serial =3D MOCK_DC_SHARABLE_SERIAL; + mds =3D cxl_memdev_state_create(dev, serial, 0); + } if (IS_ERR(mds)) return PTR_ERR(mds); =20 @@ -1814,6 +2305,9 @@ static int cxl_mock_mem_probe(struct platform_device = *pdev) if (rc) return rc; =20 + if (cxl_dcd_supported(mds)) + cxl_configure_dcd(mds, &range_info); + rc =3D cxl_dpa_setup(cxlds, &range_info); if (rc) return rc; @@ -1921,11 +2415,321 @@ static ssize_t sanitize_timeout_store(struct devic= e *dev, =20 static DEVICE_ATTR_RW(sanitize_timeout); =20 +/* Return if the proposed extent would break the test code */ +static bool new_extent_valid(struct device *dev, size_t new_start, + size_t new_len) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_extent_data *extent; + size_t new_end, i; + + if (!new_len) + return false; + + new_end =3D new_start + new_len; + + dev_dbg(dev, "New extent %zx-%zx\n", new_start, new_end); + + guard(mutex)(&mdata->ext_lock); + dev_dbg(dev, "Checking extents starts...\n"); + xa_for_each(&mdata->dc_fm_extents, i, extent) { + if (extent->dpa_start =3D=3D new_start) + return false; + } + + dev_dbg(dev, "Checking sent extents starts...\n"); + xa_for_each(&mdata->dc_sent_extents, i, extent) { + if (extent->dpa_start =3D=3D new_start) + return false; + } + + dev_dbg(dev, "Checking accepted extents starts...\n"); + xa_for_each(&mdata->dc_accepted_exts, i, extent) { + if (extent->dpa_start =3D=3D new_start) + return false; + } + + return true; +} + +struct cxl_test_dcd { + uuid_t id; + struct cxl_event_dcd rec; +} __packed; + +struct cxl_test_dcd dcd_event_rec_template =3D { + .id =3D CXL_EVENT_DC_EVENT_UUID, + .rec =3D { + .hdr =3D { + .length =3D sizeof(struct cxl_test_dcd), + }, + }, +}; + +static int log_dc_event(struct cxl_mockmem_data *mdata, enum dc_event type, + u64 start, u64 length, const char *tag_str, + 
u16 shared_extn_seq, bool more) +{ + struct device *dev =3D mdata->mds->cxlds.dev; + struct cxl_test_dcd *dcd_event; + uuid_t tag; + int rc; + + dev_dbg(dev, "mock device log event %d\n", type); + + dcd_event =3D devm_kmemdup(dev, &dcd_event_rec_template, + sizeof(*dcd_event), GFP_KERNEL); + if (!dcd_event) + return -ENOMEM; + + dcd_event->rec.flags =3D 0; + if (more) + dcd_event->rec.flags |=3D CXL_DCD_EVENT_MORE; + dcd_event->rec.event_type =3D type; + dcd_event->rec.extent.start_dpa =3D cpu_to_le64(start); + dcd_event->rec.extent.length =3D cpu_to_le64(length); + rc =3D parse_tag(tag_str, &tag); + if (rc) { + devm_kfree(dev, dcd_event); + return rc; + } + export_uuid(dcd_event->rec.extent.uuid, &tag); + dcd_event->rec.extent.shared_extn_seq =3D cpu_to_le16(shared_extn_seq); + + mes_add_event(mdata, CXL_EVENT_TYPE_DCD, + (struct cxl_event_record_raw *)dcd_event); + + /* Fake the irq */ + cxl_mem_get_event_records(mdata->mds, CXLDEV_EVENT_STATUS_DCD); + + return 0; +} + +static void mark_extent_sent(struct device *dev, unsigned long long start) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + struct cxl_extent_data *ext; + + guard(mutex)(&mdata->ext_lock); + ext =3D xa_erase(&mdata->dc_fm_extents, start); + if (xa_insert(&mdata->dc_sent_extents, ext->dpa_start, ext, GFP_KERNEL)) + dev_err(dev, "Failed to mark extent %#llx sent\n", ext->dpa_start); +} + +/* + * Format ::: + * + * start and length must be a multiple of the configured partition block s= ize. + * Tag can be any string up to 16 bytes. + * + * Extents must be exclusive of other extents + * + * If the more flag is specified it is expected that an additional extent = will + * be specified without the more flag to complete the test transaction wit= h the + * host. 
+ */ +static ssize_t __dc_inject_extent_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count, + bool shared) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + unsigned long long start, length, more; + char *len_str, *uuid_str, *more_str, *seq_str; + u16 shared_extn_seq =3D 0; + size_t buf_len =3D count; + int rc; + + char *start_str __free(kfree) =3D kstrdup(buf, GFP_KERNEL); + if (!start_str) + return -ENOMEM; + + len_str =3D strnchr(start_str, buf_len, ':'); + if (!len_str) { + dev_err(dev, "Extent failed to find len_str: %s\n", start_str); + return -EINVAL; + } + + *len_str =3D '\0'; + len_str +=3D 1; + buf_len -=3D strlen(start_str); + + uuid_str =3D strnchr(len_str, buf_len, ':'); + if (!uuid_str) { + dev_err(dev, "Extent failed to find uuid_str: %s\n", len_str); + return -EINVAL; + } + *uuid_str =3D '\0'; + uuid_str +=3D 1; + + more_str =3D strnchr(uuid_str, buf_len, ':'); + if (!more_str) { + dev_err(dev, "Extent failed to find more_str: %s\n", uuid_str); + return -EINVAL; + } + *more_str =3D '\0'; + more_str +=3D 1; + + /* Optional 5th field: shared_extn_seq. Absent -> 0. 
*/ + seq_str =3D strnchr(more_str, buf_len, ':'); + if (seq_str) { + unsigned long long seq; + + *seq_str =3D '\0'; + seq_str +=3D 1; + if (kstrtoull(seq_str, 0, &seq) || seq > U16_MAX) { + dev_err(dev, "Extent failed to parse seq: %s\n", + seq_str); + return -EINVAL; + } + shared_extn_seq =3D seq; + } + + if (kstrtoull(start_str, 0, &start)) { + dev_err(dev, "Extent failed to parse start: %s\n", start_str); + return -EINVAL; + } + + if (kstrtoull(len_str, 0, &length)) { + dev_err(dev, "Extent failed to parse length: %s\n", len_str); + return -EINVAL; + } + + if (kstrtoull(more_str, 0, &more)) { + dev_err(dev, "Extent failed to parse more: %s\n", more_str); + return -EINVAL; + } + + if (!new_extent_valid(dev, start, length)) + return -EINVAL; + + rc =3D devm_add_fm_extent(dev, start, length, uuid_str, shared_extn_seq, + shared); + if (rc) { + dev_err(dev, "Failed to add extent DPA:%#llx LEN:%#llx; %d\n", + start, length, rc); + return rc; + } + + mark_extent_sent(dev, start); + rc =3D log_dc_event(mdata, DCD_ADD_CAPACITY, start, length, uuid_str, + shared_extn_seq, more); + if (rc) { + dev_err(dev, "Failed to add event %d\n", rc); + return rc; + } + + return count; +} + +static ssize_t dc_inject_extent_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + return __dc_inject_extent_store(dev, attr, buf, count, false); +} +static DEVICE_ATTR_WO(dc_inject_extent); + +static ssize_t dc_inject_shared_extent_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + return __dc_inject_extent_store(dev, attr, buf, count, true); +} +static DEVICE_ATTR_WO(dc_inject_shared_extent); + +static ssize_t __dc_del_extent_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count, + enum dc_event type) +{ + struct cxl_mockmem_data *mdata =3D dev_get_drvdata(dev); + unsigned long long start, length; + char *len_str, *uuid_str; + size_t buf_len =3D count; + int rc; + + 
char *start_str __free(kfree) =3D kstrdup(buf, GFP_KERNEL); + if (!start_str) + return -ENOMEM; + + len_str =3D strnchr(start_str, buf_len, ':'); + if (!len_str) { + dev_err(dev, "Failed to find len_str: %s\n", start_str); + return -EINVAL; + } + *len_str =3D '\0'; + len_str +=3D 1; + buf_len -=3D strlen(start_str); + + uuid_str =3D strnchr(len_str, buf_len, ':'); + if (!uuid_str) { + dev_err(dev, "Failed to find uuid_str: %s\n", len_str); + return -EINVAL; + } + *uuid_str =3D '\0'; + uuid_str +=3D 1; + /* + * uuid_str is the trailing field; trim shell-added '\n' so + * parse_tag()/uuid_parse() see a clean string. + */ + uuid_str =3D strim(uuid_str); + + if (kstrtoull(start_str, 0, &start)) { + dev_err(dev, "Failed to parse start: %s\n", start_str); + return -EINVAL; + } + + if (kstrtoull(len_str, 0, &length)) { + dev_err(dev, "Failed to parse length: %s\n", len_str); + return -EINVAL; + } + + dc_delete_extent(dev, start, length); + + if (type =3D=3D DCD_FORCED_CAPACITY_RELEASE) + dev_dbg(dev, "Forcing delete of extent %#llx len:%#llx\n", + start, length); + + rc =3D log_dc_event(mdata, type, start, length, uuid_str, 0, false); + if (rc) { + dev_err(dev, "Failed to add event %d\n", rc); + return rc; + } + + return count; +} + +/* + * Format :: + */ +static ssize_t dc_del_extent_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + return __dc_del_extent_store(dev, attr, buf, count, + DCD_RELEASE_CAPACITY); +} +static DEVICE_ATTR_WO(dc_del_extent); + +static ssize_t dc_force_del_extent_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + return __dc_del_extent_store(dev, attr, buf, count, + DCD_FORCED_CAPACITY_RELEASE); +} +static DEVICE_ATTR_WO(dc_force_del_extent); + static struct attribute *cxl_mock_mem_attrs[] =3D { &dev_attr_security_lock.attr, &dev_attr_event_trigger.attr, &dev_attr_fw_buf_checksum.attr, &dev_attr_sanitize_timeout.attr, + &dev_attr_dc_inject_extent.attr, + 
&dev_attr_dc_inject_shared_extent.attr, + &dev_attr_dc_del_extent.attr, + &dev_attr_dc_force_del_extent.attr, NULL }; ATTRIBUTE_GROUPS(cxl_mock_mem); --=20 2.43.0 From nobody Sun May 24 20:33:06 2026
From: Anisa Su To: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Cc: nvdimm@lists.linux.dev, Dan Williams, Jonathan Cameron, Davidlohr Bueso, Dave Jiang, Vishal Verma, Ira Weiny, Alison Schofield, John Groves, Gregory Price, Anisa Su Subject: [PATCH v10 31/31] Documentation/cxl: Document DCD extent handling and DC-backed DAX regions Date: Sat, 23 May 2026 02:43:25 -0700 X-Mailer: git-send-email 2.43.0 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Extend the CXL and DAX driver-api documentation to cover Dynamic Capacity Devices. cxl-driver.rst gains a "Dynamic Capacity Extents" section describing the conditions under which the CXL core accepts an offered extent (per-extent: region resolution, full ED-range containment, no-overlap, duplicate tolerance; per-tag-group: host-wide tag-uuid uniqueness, sequence-number integrity, partition equality, alignment) and the conditions under which a release request is honoured (DPA-range containment in some member, tag match, DAX-layer EBUSY deferral, whole-tag-group release). 
The host-wide uniqueness gate is enforced by the cxl_tag_register registry in drivers/cxl/core/extent.c. For sequence numbers the doc spells out both regimes =E2=80=94 device-stamped 1..n on sharable allocations and host-assigned arrival-order 1..n (via cxl_add_pending's logical_seq) on non-sharable allocations =E2=80=94 and notes that the DAX layer sees one unified 1..n dense invariant. dax-driver.rst gains a "Dynamic Capacity (DC) Regions" section that lays out the four-object layering device extent =E2=86=92 dc_extent = =E2=86=92 dax_resource =E2=86=92 DAX device, with cardinalities: one tagged allocation maps to one cxl_dc_tag_group containing N dc_extents and N dax_resources, claimed into one DAX device with N range entries in seq_num order; an untagged Add delivery becomes its own single-member group. Each dc_extent carries its own hpa_range =E2=80=94 there is no aggregated bounding-box range across siblings. Tag-based DAX device creation, DC-only sizing rules (no grow, size=3D0 to destroy), and the uuid attribute semantics are documented alongside. Signed-off-by: Anisa Su --- .../driver-api/cxl/linux/cxl-driver.rst | 149 ++++++++++++++++ .../driver-api/cxl/linux/dax-driver.rst | 167 ++++++++++++++++++ 2 files changed, 316 insertions(+) diff --git a/Documentation/driver-api/cxl/linux/cxl-driver.rst b/Documentat= ion/driver-api/cxl/linux/cxl-driver.rst index dd6dd17dc536..cb08fc536da8 100644 --- a/Documentation/driver-api/cxl/linux/cxl-driver.rst +++ b/Documentation/driver-api/cxl/linux/cxl-driver.rst @@ -619,6 +619,155 @@ from HPA to DPA. This is why they must be aware of t= he entire interleave set. Linux does not support unbalanced interleave configurations. As a result,= all endpoints in an interleave set must have the same ways and granularity. 
=20 +Dynamic Capacity Extents +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +A `Dynamic Capacity Device (DCD)` advertises capacity in `DC partitions` +and surfaces individual chunks of that capacity to the host as `extents`. +The device may add an extent at any time (a `pending add`) and may +request that a previously accepted extent be released (a `pending +release`). Each transition is mediated by a mailbox handshake whose +state machine the CXL driver enforces in +:code:`drivers/cxl/core/{mbox.c,extent.c}`. + +Extents that share a non-null tag form one logical allocation. Each +surviving member becomes its own :code:`struct dc_extent` (per-extent +sysfs device, per-extent HPA range); their containing tag group is an +internal-only :code:`struct cxl_dc_tag_group` keyed by UUID with no +sysfs identity. Each :code:`dc_extent` becomes one +:code:`dax_resource` on the DAX side, and a tagged DAX device is built +by claiming every :code:`dax_resource` that carries the tag. + +For DAX-side semantics =E2=80=94 how accepted extents materialize into +:code:`dax_resource` objects and DAX devices =E2=80=94 see +:doc:`dax-driver`. + +Accepting Extents +----------------- +Extents are made available to the host from the device through DC ADD even= ts. +Event records contain extents, which may be tagged or untagged, shared or +not shared. Multiple event records can be chained together by the `More` f= lag. + +The unit of allocation is a `tag`. All extents +sharing a tag form one allocation; the More flag is a delivery boundary +only, meaning when the More chain ends, the host can assume that all exten= ts +have been collected for each tag. +A tag may be the null UUID (an `untagged` allocation, valid in +non-sharable regions) or a non-null UUID identifying a sharable or +non-sharable allocation. + +When a `More`-terminated chain of pending adds closes, the driver +processes the pending list one tag group at a time. 
A group is +committed only if it passes every gate below; failing any gate drops +the entire group with a firmware-bug warning, and the dropped extents +do not appear in the :code:`ADD_DC_RESPONSE`. There is no +partial-extent acceptance =E2=80=94 either an offered extent is accepted w= hole +or it is dropped whole. + +Per-extent gates (applied in :code:`cxl_add_extent`, +:code:`drivers/cxl/core/extent.c`): + +* The extent's DPA range must resolve to a CXL region via + :code:`cxl_dpa_to_region()`. An extent with no owning region is + dropped; the device sees the omission from :code:`ADD_DC_RESPONSE`. +* The extent's DPA range must be `fully contained` in the endpoint + decoder's DPA range. An extent that straddles the decoder boundary + is rejected with :code:`-ENXIO`; the driver never clips an extent to + fit. +* The extent must not overlap an extent already present in the same + region. Overlap classification is done in + :code:`cxlr_dax_classify_extent()` using :code:`range_overlaps()`. + Exact duplicates of a previously-accepted range are tolerated =E2=80=94 + accepting the same range twice is a no-op, which simplifies + probe-time scans of the device's existing accepted list. + +Per-group gates (applied in :code:`cxl_add_pending`, +:code:`drivers/cxl/core/mbox.c`): + +* `Host-wide tag uniqueness`: a non-null tag must not already + correspond to a live :code:`cxl_dc_tag_group` anywhere on this host. + The orchestrator (FM) owns tag-UUID allocation per spec; the + registry in :code:`drivers/cxl/core/extent.c` + (:code:`cxl_tag_register` / :code:`cxl_tag_already_committed`) + catches firmware bugs and orchestrator misbehavior across every + region and memdev. Skipped for the null UUID, which has no + cross-chain identity. +* `Sequence-number integrity`: every member must carry the wire + field :code:`shared_extn_seq =3D=3D 0` (non-sharable allocation), or + the group's sorted sequence numbers must be exactly + :code:`1, 2, =E2=80=A6, n` (sharable allocation). 
Mixed, gapped, + duplicate, or non-zero-but-not-starting-at-1 sets are rejected. +* `Partition equality`: every tagged extent in the group must + resolve to the same DC partition. A single allocation cannot span + partitions because CDAT describes sharable / writable / coherency + attributes per-partition. Skipped for the null UUID. +* `Alignment`: every extent's :code:`start_dpa` and :code:`length` + must be :code:`CXL_DCD_EXTENT_ALIGN`-aligned. Partial acceptance + of an aligned subset would leave an unusable DAX device, so the + group is dropped instead. + +Surviving extents are sorted by the wire field +:code:`shared_extn_seq` =E2=80=94 stable, so arrival order is preserved for +the all-zero non-sharable case =E2=80=94 and each becomes a +:code:`dc_extent` inserted into a fresh :code:`cxl_dc_tag_group` +keyed by the group's UUID. Each :code:`dc_extent` carries its own +:code:`hpa_range`; the tag group itself has no aggregate range. + +As each surviving extent is attached the host assigns it a 1..n +:code:`seq_num`: for sharable allocations this equals the +device-stamped :code:`shared_extn_seq` directly; for non-sharable +allocations the device sends :code:`shared_extn_seq =3D=3D 0` and the +host fills in the arrival-order position (see :code:`logical_seq` in +:code:`cxl_add_pending`). The DAX layer enforces the same +:code:`1..n` dense invariant in both cases. + +The tag group is brought online via :code:`online_tag_group()`, +which registers every member :code:`dc_extent` as an +:code:`extentX.Y` child of :code:`cxlr_dax->dev`, the DAX layer is +notified with :code:`DCD_ADD_CAPACITY`, and the accepted extents are +spliced into the response list for a single :code:`ADD_DC_RESPONSE` +mailbox per More-chain. + +Releasing Extents +----------------- + +A release may be initiated by the device (a pending release +notification) or by the host (when destroying a DAX device or tearing +down a region). 
Both paths converge on :code:`cxl_rm_extent` +(:code:`drivers/cxl/core/extent.c`). + +Per-extent gates: + +* The DPA range must resolve to a CXL region. If it does not =E2=80=94 for + example, an extent left over from a host crash that has not yet + been re-claimed, or a duplicate release racing region teardown =E2=80=94 + the release is acknowledged via :code:`memdev_release_extent()` so + the device knows the host is not using the capacity, and the + operation returns :code:`-ENXIO`. +* The DPA range must be `fully contained` in some member + :code:`dc_extent`'s :code:`dpa_range` on the region's + :code:`cxlr_dax`, and the tag (UUID) on that member's + :code:`cxl_dc_tag_group` must match the release request. Releases + are keyed by :code:`(DPA range, tag)` rather than by pointer + because the device, not the host, supplies the identity. A + request that matches no :code:`dc_extent` is rejected with + :code:`-EINVAL`. + +If those gates pass, the DAX layer is notified with +:code:`DCD_RELEASE_CAPACITY` and consulted for permission to proceed. +If the DAX layer returns :code:`-EBUSY` =E2=80=94 the capacity is still ma= pped +or otherwise in use =E2=80=94 the release is deferred and +:code:`cxl_rm_extent` returns success without unregistering anything. +When the DAX layer ultimately grants release, +:code:`rm_tag_group()` invalidates the backing memregion once for the +whole group, then unregisters every member :code:`dc_extent` device, +which cascades through the DAX layer to drop the corresponding +:code:`dax_resource`\ s. + +The release path is always whole-tag-group: tagged allocations +release atomically, and the kernel does not split a group in response +to a sub-range release request. + Example Configurations =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D .. 
toctree:: diff --git a/Documentation/driver-api/cxl/linux/dax-driver.rst b/Documentat= ion/driver-api/cxl/linux/dax-driver.rst index 10d953a2167b..07f08396f639 100644 --- a/Documentation/driver-api/cxl/linux/dax-driver.rst +++ b/Documentation/driver-api/cxl/linux/dax-driver.rst @@ -27,6 +27,173 @@ CXL capacity in the task's page tables. Users wishing to manually handle allocation of CXL memory should use this interface. =20 +Dynamic Capacity (DC) Regions +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D +A region backed by a CXL `Dynamic Capacity Device (DCD)` is a `DC region`: +its HPA window is fixed at probe time, but the DPA capacity that fills the +window arrives and departs at runtime as the device offers and reclaims +`extents`. DC regions are distinguished from static regions by the +:code:`IORESOURCE_DAX_DCD` flag on the :code:`dax_region`. + +For the CXL-side rules governing when an offered extent is accepted or a +release request is honoured, see :doc:`cxl-driver`. This section covers +the DAX-side mapping between accepted extents and DAX devices. + +The Extent Layering Model +------------------------- +Four objects sit between the wire-level CXL extent and the +user-visible DAX device. Understanding the cardinality between them +is the key to the DC-region model. 
+
+::
+
+  device extents         dc_extent     dax_resource         DAX device
+  (CXL device)           (CXL core)    (DAX bus)            (/dev/daxN.Y)
+  -------------          ------------- -------------        ------------
+  e1 ─┐              ┌─► dc_e1     ──► res_1 (seq=1) ──┐
+  e2 ─┼─── tag A ──► ┼─► dc_e2     ──► res_2 (seq=2) ──┼──► daxN.0
+  e3 ─┘              └─► dc_e3     ──► res_3 (seq=3) ──┘    (claimed by tag A,
+                                                             size = Σ |e_i|)
+
+  e4 ─── tag B ────►     dc_e4     ──► res_4 (seq=1) ────►  daxN.1
+
+  e5 ─── null tag ─►     dc_e5     ──► res_5 (seq=0) ────►  daxN.2
+  e6 ─── null tag ─►     dc_e6     ──► res_6 (seq=0) ────►  daxN.3
+
+The CXL core groups extents sharing a non-null tag into a single +:code:`cxl_dc_tag_group` (internal-only, no sysfs identity), but each +member extent stays a distinct :code:`dc_extent` with its own HPA +range. The DAX bridge creates one :code:`dax_resource` per +:code:`dc_extent`, and userspace claims a DAX device by writing the +tag's UUID to the seed device's :code:`uuid` attribute, which carves +every matching :code:`dax_resource` (in :code:`seq_num` order) into +the device's :code:`ranges[]` array. + +`Device extent` + The unit the CXL device delivers over the mailbox: a + :code:`(DPA, length, tag, shared_extn_seq)` tuple inside an + Add-Capacity event. 
The tag is either a non-null UUID (a + `tagged allocation`) or the null UUID (`untagged`). + +:code:`dc_extent` + The CXL core's per-extent object, one per surviving device extent. + Each :code:`dc_extent` is registered as its own :code:`extentX.Y` + sysfs device under :code:`cxlr_dax->dev` and carries its own + :code:`hpa_range` =E2=80=94 there is no aggregated / bounding-box HPA + range across siblings. Members of one tag group point at a + shared :code:`cxl_dc_tag_group` (which holds the UUID and a + manual refcount on the surviving siblings) but otherwise exist as + independent kernel objects. + + For a `non-null tag`, the host-wide tag-uniqueness gate + (:doc:`cxl-driver`) guarantees there is at most one + :code:`cxl_dc_tag_group` per UUID on the host, so the set of + :code:`dc_extent`\ s sharing that UUID is a single allocation. + + For the `null tag` there is no cross-event identity =E2=80=94 the spec is + silent on aggregating untagged extents across Add-Capacity events. + Each untagged device extent becomes its own :code:`dc_extent` in + its own single-member tag group; two untagged extents delivered + separately are two distinct allocations. + +:code:`dax_resource` + The DAX bus's per-extent view, one-to-one with :code:`dc_extent`. + When the CXL DAX driver receives a :code:`DCD_ADD_CAPACITY` + notification it iterates the tag group and calls + :code:`dax_region_add_resource()` once per member, creating one + :code:`dax_resource` per :code:`dc_extent`. Each + :code:`dax_resource` carries that member's HPA range, the tag + UUID (copied from :code:`dc_extent->group->uuid`), and a 1..n + :code:`seq_num` so :code:`uuid_claim_tagged` can carve the matched + set into the device's :code:`ranges[]` array in the right order + (see :code:`drivers/dax/bus.c`). + +`DAX device` (:code:`/dev/daxN.Y`) + Created by userspace claiming a set of :code:`dax_resource`\ s via + the :code:`uuid` sysfs attribute. 
Each DAX device corresponds to + exactly one allocation: + + * A `tagged` DAX device is built from every :code:`dax_resource` + carrying the tag =E2=80=94 one per :code:`dc_extent` in the allocation + =E2=80=94 carved into the device's :code:`ranges[]` in :code:`seq_num` + order. Its size equals the sum of every member's size. + * An `untagged` DAX device is built from one untagged + :code:`dax_resource` and its size equals that one extent. + +So the end-to-end rule is: **one tagged allocation =3D one +cxl_dc_tag_group =3D N dc_extents =3D N dax_resources =3D one DAX device +with N range entries**. An untagged device extent becomes its own +:code:`dc_extent` / :code:`dax_resource` / single-range DAX device, +claimed one at a time. + +Release follows the same layering in reverse. When the CXL core +calls :code:`rm_tag_group()` (after the device asks for release and +the DAX layer consents), the DAX bridge collects every matching +:code:`dax_resource` and removes them as a set via +:code:`dax_region_rm_resources()`. The removal is refuse-all-or-none +under :code:`dax_region_rwsem`: if any member is in use, the whole +group stays. When removal commits, the HPA capacity returns to the +region's free pool and any DAX device that had claimed it is left +with no backing capacity. Userspace tears the DAX device down via +:code:`daxctl destroy-device` (size=3D0, then write the device name to +the region's :code:`delete` attribute). + +UUID-Based DAX Device Creation +------------------------------ +A DAX device on a DC region is created by writing a UUID to the +seed device's :code:`uuid` attribute +(:code:`/sys/bus/dax/devices/daxN.Y/uuid`). The seed starts at +size 0; writing :code:`uuid` is a `claim` operation that resolves +the layering above and populates the device: + +* A `non-null UUID` claims `every` :code:`dax_resource` whose tag + matches. 
:code:`uuid_claim_tagged` (in + :code:`drivers/dax/bus.c`) collects them, sorts by + :code:`seq_num`, enforces the dense :code:`1..n` invariant, and + carves each via :code:`__dev_dax_resize` in :code:`seq_num` order + so the device's :code:`ranges[]` array is dense and ordered. + The resulting DAX device represents exactly the tagged + allocation: its size equals the sum of every member extent's + size. + + The dense :code:`1..n` invariant is the unified rule the CXL + side maintains for both sharable and non-sharable allocations + (see :doc:`cxl-driver`); the match set has exactly one entry per + :code:`dc_extent` in the tag group. + +* The value :code:`"0"` is shorthand for the null UUID and claims + exactly `one` untagged :code:`dax_resource`. Untagged + :code:`dax_resource`\ s correspond to independent untagged + allocations; collapsing several into one device would aggregate + unrelated capacity, so each :code:`uuid` write consumes a single + untagged resource. + +* A write that matches no :code:`dax_resource` returns + :code:`-ENOENT` and the device remains at size 0. + +* Writes to the :code:`uuid` attribute on non-DC regions return + :code:`-EOPNOTSUPP`; the attribute itself is read-only (0444) on + non-DC devices. + +The device's size is determined entirely by the backing allocation: +users do not choose a size on DC regions. Accordingly, the +:code:`size` attribute on a DC DAX device rejects grow requests +with :code:`-EOPNOTSUPP`. Writing :code:`0` is still permitted and is +how :code:`daxctl destroy-device` returns each claimed extent to the +region's available pool before the device's name is written to the +region's :code:`delete` attribute. + +Reads of :code:`uuid` report the tag identifying the capacity +backing the device: + +* For a non-null-UUID-claimed DC DAX device, :code:`uuid` reads + back the claimed UUID. +* For a DC DAX device claimed via :code:`"0"`, or for any + non-DCD DAX device, :code:`uuid` reads :code:`0`. 
+ +See :code:`Documentation/ABI/testing/sysfs-bus-dax` for the +authoritative attribute contracts. + kmem conversion =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D The :code:`dax_kmem` driver converts a `DAX Device` into a series of `hotp= lug --=20 2.43.0
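
The "Sequence-number integrity" gate documented in the cxl-driver.rst hunk above can be modelled in a few lines of userspace C. What follows is a minimal sketch of the rule as the text states it, not the code in cxl_add_pending(): a tag group passes if every shared_extn_seq is 0, or if the sorted values are exactly 1, 2, ..., n. The names seq_group_valid() and cmp_u16() are hypothetical, the input is assumed to be just the array of shared_extn_seq values for one tag group, and treating an empty group as invalid is an assumption made here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the patch): order two u16 values for qsort(). */
static int cmp_u16(const void *a, const void *b)
{
	uint16_t x = *(const uint16_t *)a;
	uint16_t y = *(const uint16_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical userspace model of the documented gate.
 *
 * Returns true when seq[0..n-1] is either all zero (non-sharable
 * allocation) or, once sorted, exactly 1, 2, ..., n (sharable
 * allocation). Mixed, gapped, duplicate, or not-starting-at-1 sets
 * return false. An empty group is rejected; that is an assumption
 * of this sketch.
 */
static bool seq_group_valid(const uint16_t *seq, size_t n)
{
	uint16_t *sorted;
	size_t i, zeros = 0;
	bool ok = true;

	if (!n)
		return false;

	for (i = 0; i < n; i++)
		if (seq[i] == 0)
			zeros++;

	if (zeros == n)
		return true;	/* all-zero: non-sharable regime */
	if (zeros)
		return false;	/* zero and non-zero members mixed */

	/* Sort a private copy so the caller's arrival order is untouched. */
	sorted = malloc(n * sizeof(*sorted));
	if (!sorted)
		return false;
	memcpy(sorted, seq, n * sizeof(*sorted));
	qsort(sorted, n, sizeof(*sorted), cmp_u16);

	/* Dense 1..n: the i-th smallest value must be i + 1. */
	for (i = 0; i < n; i++) {
		if (sorted[i] != i + 1) {
			ok = false;
			break;
		}
	}

	free(sorted);
	return ok;
}
```

Under this sketch, {0, 0, 0} and {2, 1, 3} pass, while {0, 1}, {1, 3}, {1, 1} and {2, 3} are rejected, which lines up with the "mixed, gapped, duplicate, or non-zero-but-not-starting-at-1" cases the documentation lists.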