From: Andreas Hindborg
To: Jens Axboe, Christoph Hellwig, Keith Busch, Damien Le Moal,
 Bart Van Assche, Hannes Reinecke, Ming Lei,
 "linux-block@vger.kernel.org"
Cc: Andreas Hindborg, Wedson Almeida Filho, Greg KH, Matthew Wilcox,
 Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron,
 Benno Lossin, Alice Ryhl, Chaitanya Kulkarni, Luis Chamberlain,
 Yexuan Yang <1182282462@bupt.edu.cn>, Sergio González Collado,
 Joel Granados, "Pankaj Raghav (Samsung)", Daniel Gomez, Niklas Cassel,
 Philipp Stanner, Conor Dooley, Johannes Thumshirn, Matias Bjørling,
 open list, "rust-for-linux@vger.kernel.org",
 "lsf-pc@lists.linux-foundation.org", "gost.dev@samsung.com"
Subject: [PATCH 1/3] rust: block: introduce `kernel::block::mq` module
Date: Sun, 12 May 2024 12:39:46 -0600
Message-ID: <20240512183950.1982353-2-nmi@metaspace.dk>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240512183950.1982353-1-nmi@metaspace.dk>
References: <20240512183950.1982353-1-nmi@metaspace.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andreas Hindborg

Add initial abstractions for working with blk-mq.

This patch is a maintained, refactored subset of code originally published by
Wedson Almeida Filho [1].

[1] https://github.com/wedsonaf/linux/tree/f2cfd2fe0e2ca4e90994f96afe268bbd4382a891/rust/kernel/blk/mq.rs

Cc: Wedson Almeida Filho
Signed-off-by: Andreas Hindborg
---
 rust/bindings/bindings_helper.h    |   2 +
 rust/helpers.c                     |  16 ++
 rust/kernel/block.rs               |   5 +
 rust/kernel/block/mq.rs            | 109 +++++++++++++
 rust/kernel/block/mq/gen_disk.rs   | 205 ++++++++++++++++++++++++
 rust/kernel/block/mq/operations.rs | 245 +++++++++++++++++++++++++++++
 rust/kernel/block/mq/raw_writer.rs |  55 +++++++
 rust/kernel/block/mq/request.rs    | 227 ++++++++++++++++++++++++++
 rust/kernel/block/mq/tag_set.rs    |  93 +++++++++++
 rust/kernel/error.rs               |   5 +
 rust/kernel/lib.rs                 |   2 +
 11 files changed, 964 insertions(+)
 create mode 100644 rust/kernel/block.rs
 create mode 100644 rust/kernel/block/mq.rs
 create mode 100644 rust/kernel/block/mq/gen_disk.rs
 create mode 100644 rust/kernel/block/mq/operations.rs
 create mode 100644 rust/kernel/block/mq/raw_writer.rs
 create mode 100644 rust/kernel/block/mq/request.rs
 create mode 100644 rust/kernel/block/mq/tag_set.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 65b98831b975..b45000342be3 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -7,6 +7,8 @@
  */
 
 #include
+#include
+#include
 #include
 #include
 #include
diff --git a/rust/helpers.c b/rust/helpers.c
index 70e59efd92bc..f151d6b3fdfb 100644
--- a/rust/helpers.c
+++ b/rust/helpers.c
@@ -178,3 +178,19 @@ static_assert(
 	__alignof__(size_t) == __alignof__(uintptr_t),
 	"Rust code expects C `size_t` to match Rust `usize`"
 );
+
+// This will soon be moved to a separate file, so no need to merge with above.
+#include
+#include
+
+void *rust_helper_blk_mq_rq_to_pdu(struct request *rq)
+{
+	return blk_mq_rq_to_pdu(rq);
+}
+EXPORT_SYMBOL_GPL(rust_helper_blk_mq_rq_to_pdu);
+
+struct request *rust_helper_blk_mq_rq_from_pdu(void *pdu)
+{
+	return blk_mq_rq_from_pdu(pdu);
+}
+EXPORT_SYMBOL_GPL(rust_helper_blk_mq_rq_from_pdu);
diff --git a/rust/kernel/block.rs b/rust/kernel/block.rs
new file mode 100644
index 000000000000..150f710efe5b
--- /dev/null
+++ b/rust/kernel/block.rs
@@ -0,0 +1,5 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Types for working with the block layer.
+
+pub mod mq;
diff --git a/rust/kernel/block/mq.rs b/rust/kernel/block/mq.rs
new file mode 100644
index 000000000000..238387f1ab31
--- /dev/null
+++ b/rust/kernel/block/mq.rs
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module provides types for implementing block drivers that interface the
+//! blk-mq subsystem.
+//!
+//! To implement a block device driver, a Rust module must do the following:
+//!
+//! - Implement [`Operations`] for a type `T`
+//! - Create a [`TagSet`]
+//! - Create a [`GenDisk`], passing in the `TagSet` reference
+//! - Add the disk to the system by calling [`GenDisk::add`]
+//!
+//! The types available in this module that have direct C counterparts are:
+//!
+//! - The `TagSet` type that abstracts the C type `struct blk_mq_tag_set`.
+//! - The `GenDisk` type that abstracts the C type `struct gendisk`.
+//! - The `Request` type that abstracts the C type `struct request`.
+//!
+//! Many of the C types that this module abstracts allow a driver to carry
+//! private data, either embedded in the struct directly, or as a C `void*`. In
+//! these abstractions, this data is typed. The type of the data is defined by
+//! associated types in `Operations`, see [`Operations::RequestData`] for an
+//! example.
+//!
+//! The kernel will interface with the block device driver by calling the method
+//! implementations of the `Operations` trait.
+//!
+//! IO requests are passed to the driver as [`Request`] references. The
+//! `Request` type is a wrapper around the C `struct request`. The driver must
+//! mark start of request processing by calling [`Request::start`] and end of
+//! processing by calling one of the [`Request::end`] methods. Failure to do so
+//! can lead to deadlock or timeout errors.
+//!
+//! The `TagSet` is responsible for creating and maintaining a mapping between
+//! `Request`s and integer ids as well as carrying a pointer to the vtable
+//! generated by `Operations`. This mapping is useful for associating
+//! completions from hardware with the correct `Request` instance. The `TagSet`
+//! determines the maximum queue depth by setting the number of `Request`
+//! instances available to the driver, and it determines the number of queues to
+//! instantiate for the driver. If possible, a driver should allocate one queue
+//! per core, to keep queue data local to a core.
+//!
+//! One `TagSet` instance can be shared between multiple `GenDisk` instances.
+//! This can be useful when implementing drivers where one piece of hardware
+//! with one set of IO resources is represented to the user as multiple disks.
+//!
+//! One significant difference between block device drivers implemented with
+//! these Rust abstractions and drivers implemented in C, is that the Rust
+//! drivers have to own a reference count on the `Request` type when the IO is
+//! in flight. This is to ensure that the C `struct request` instances backing
+//! the Rust `Request` instances are live while the Rust driver holds a
+//! reference to the `Request`. In addition, the conversion of an integer tag to
+//! a `Request` via the `TagSet` would not be sound without this bookkeeping.
+//!
+//! # ⚠ Note
+//!
+//! For Rust block device drivers, the point in time where a request is freed
+//! and made available for recycling is usually at the point in time when the
+//! last `ARef` is dropped. For C drivers, this event usually occurs
+//! when `bindings::blk_mq_end_request` is called.
+//!
+//! # Example
+//!
+//! ```rust
+//! use kernel::{
+//!     block::mq::*,
+//!     new_mutex,
+//!     prelude::*,
+//!     sync::{Arc, Mutex},
+//!     types::{ARef, ForeignOwnable},
+//! };
+//!
+//! struct MyBlkDevice;
+//!
+//! #[vtable]
+//! impl Operations for MyBlkDevice {
+//!
+//!     fn queue_rq(rq: ARef<Request<Self>>, _is_last: bool) -> Result {
+//!         Request::end_ok(rq);
+//!         Ok(())
+//!     }
+//!
+//!     fn commit_rqs(
+//!     ) {
+//!     }
+//!
+//!     fn complete(rq: ARef<Request<Self>>) {
+//!         Request::end_ok(rq);
+//!     }
+//! }
+//!
+//! let tagset: Arc<TagSet<MyBlkDevice>> = Arc::pin_init(TagSet::try_new(1, 256, 1))?;
+//! let mut disk = gen_disk::try_new(tagset)?;
+//! disk.set_name(format_args!("myblk"))?;
+//! disk.set_capacity_sectors(4096);
+//! disk.add()?;
+//!
+//! # Ok::<(), kernel::error::Error>(())
+//! ```
+
+pub mod gen_disk;
+mod operations;
+mod raw_writer;
+mod request;
+mod tag_set;
+
+pub use operations::Operations;
+pub use request::Request;
+pub use tag_set::TagSet;
diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
new file mode 100644
index 000000000000..0cabbdbedb06
--- /dev/null
+++ b/rust/kernel/block/mq/gen_disk.rs
@@ -0,0 +1,205 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Generic disk abstraction.
+//!
+//! C header: [`include/linux/blkdev.h`](srctree/include/linux/blkdev.h)
+//! C header: [`include/linux/blk-mq.h`](srctree/include/linux/blk-mq.h)
+
+use crate::block::mq::{raw_writer::RawWriter, Operations, TagSet};
+use crate::{bindings, error::from_err_ptr, error::Result, sync::Arc};
+use core::fmt::{self, Write};
+use core::marker::PhantomData;
+
+/// A generic block device.
+///
+/// # Invariants
+///
+/// - `gendisk` must always point to an initialized and valid `struct gendisk`.
+pub struct GenDisk<T: Operations, S: GenDiskState> {
+    _tagset: Arc<TagSet<T>>,
+    gendisk: *mut bindings::gendisk,
+    _phantom: core::marker::PhantomData<S>,
+}
+
+// SAFETY: `GenDisk` is an owned pointer to a `struct gendisk` and an `Arc` to a
+// `TagSet`. It is safe to send this to other threads as long as T is Send.
+unsafe impl<T: Operations + Send, S: GenDiskState> Send for GenDisk<T, S> {}
+
+/// Disks in this state are allocated and initialized, but are not yet
+/// accessible from the kernel VFS.
+pub enum Initialized {}
+
+/// Disks in this state have been attached to the kernel VFS and may receive IO
+/// requests.
+pub enum Added {}
+
+/// Typestate representing states of a `GenDisk`.
+pub trait GenDiskState {}
+
+impl GenDiskState for Initialized {}
+impl GenDiskState for Added {}
+
+impl<T: Operations> GenDisk<T, Initialized> {
+    /// Register the device with the kernel. When this function returns, the
+    /// device is accessible from VFS. The kernel may issue reads to the device
+    /// during registration to discover partition information.
+    pub fn add(self) -> Result<GenDisk<T, Added>> {
+        crate::error::to_result(
+            // SAFETY: By type invariant, `self.gendisk` points to a valid and
+            // initialized instance of `struct gendisk`
+            unsafe {
+                bindings::device_add_disk(
+                    core::ptr::null_mut(),
+                    self.gendisk,
+                    core::ptr::null_mut(),
+                )
+            },
+        )?;
+
+        // We don't want to run the destructor and remove the device from the VFS
+        // when `disk` is dropped.
+        let mut old = core::mem::ManuallyDrop::new(self);
+
+        let new = GenDisk {
+            _tagset: old._tagset.clone(),
+            gendisk: old.gendisk,
+            _phantom: PhantomData,
+        };
+
+        // But we have to drop the `Arc` or it will leak.
+        // SAFETY: `old._tagset` is valid for write, aligned, non-null, and we
+        // have exclusive access. We are not accessing the value again after it
+        // is dropped.
+        unsafe { core::ptr::drop_in_place(&mut old._tagset) };
+
+        Ok(new)
+    }
+
+    /// Set the name of the device.
+    pub fn set_name(&mut self, args: fmt::Arguments<'_>) -> Result {
+        let mut raw_writer = RawWriter::from_array(
+            // SAFETY: By type invariant `self.gendisk` points to a valid and
+            // initialized instance. We have exclusive access, since the disk is
+            // not added to the VFS yet.
+            unsafe { &mut (*self.gendisk).disk_name },
+        )?;
+        raw_writer.write_fmt(args)?;
+        raw_writer.write_char('\0')?;
+        Ok(())
+    }
+
+    /// Set the logical block size of the device.
+    ///
+    /// This is the smallest unit the storage device can address. It is
+    /// typically 512 bytes.
+    pub fn set_queue_logical_block_size(&mut self, size: u32) {
+        // SAFETY: By type invariant, `self.gendisk` points to a valid and
+        // initialized instance of `struct gendisk`.
+        unsafe { bindings::blk_queue_logical_block_size((*self.gendisk).queue, size) };
+    }
+
+    /// Set the physical block size of the device.
+    ///
+    /// This is the smallest unit a physical storage device can write
+    /// atomically. It is usually the same as the logical block size but may be
+    /// bigger. One example is SATA drives with 4KB sectors that expose a
+    /// 512-byte logical block size to the operating system.
+    pub fn set_queue_physical_block_size(&mut self, size: u32) {
+        // SAFETY: By type invariant, `self.gendisk` points to a valid and
+        // initialized instance of `struct gendisk`.
+        unsafe { bindings::blk_queue_physical_block_size((*self.gendisk).queue, size) };
+    }
+}
+
+impl<T: Operations, S: GenDiskState> GenDisk<T, S> {
+    /// Call to tell the block layer the capacity of the device in sectors (512B).
+    pub fn set_capacity_sectors(&self, sectors: u64) {
+        // SAFETY: By type invariant, `self.gendisk` points to a valid and
+        // initialized instance of `struct gendisk`. Callee takes a lock to
+        // synchronize this operation, so we will not race.
+        unsafe { bindings::set_capacity(self.gendisk, sectors) };
+    }
+
+    /// Set the rotational media attribute for the device.
+    pub fn set_rotational(&self, rotational: bool) {
+        if !rotational {
+            // SAFETY: By type invariant, `self.gendisk` points to a valid and
+            // initialized instance of `struct gendisk`. This operation uses a
+            // relaxed atomic bit flip operation, so there is no race on this
+            // field.
+            unsafe {
+                bindings::blk_queue_flag_set(bindings::QUEUE_FLAG_NONROT, (*self.gendisk).queue)
+            };
+        } else {
+            // SAFETY: By type invariant, `self.gendisk` points to a valid and
+            // initialized instance of `struct gendisk`. This operation uses a
+            // relaxed atomic bit flip operation, so there is no race on this
+            // field.
+            unsafe {
+                bindings::blk_queue_flag_clear(bindings::QUEUE_FLAG_NONROT, (*self.gendisk).queue)
+            };
+        }
+    }
+}
+
+impl<T: Operations, S: GenDiskState> Drop for GenDisk<T, S> {
+    fn drop(&mut self) {
+        // TODO: This will `WARN` if the disk was not added. Since we cannot
+        // specialize drop, we have to call it, or track state with a flag.
+
+        // SAFETY: By type invariant, `self.gendisk` points to a valid and
+        // initialized instance of `struct gendisk`
+        unsafe { bindings::del_gendisk(self.gendisk) };
+    }
+}
+
+/// Try to create a new `GenDisk`.
+pub fn try_new<T: Operations>(tagset: Arc<TagSet<T>>) -> Result<GenDisk<T, Initialized>> {
+    let lock_class_key = crate::sync::LockClassKey::new();
+
+    // SAFETY: `tagset.raw_tag_set()` points to a valid and initialized tag set
+    let gendisk = from_err_ptr(unsafe {
+        bindings::__blk_mq_alloc_disk(
+            tagset.raw_tag_set(),
+            core::ptr::null_mut(), // TODO: We can pass queue limits right here
+            core::ptr::null_mut(),
+            lock_class_key.as_ptr(),
+        )
+    })?;
+
+    const TABLE: bindings::block_device_operations = bindings::block_device_operations {
+        submit_bio: None,
+        open: None,
+        release: None,
+        ioctl: None,
+        compat_ioctl: None,
+        check_events: None,
+        unlock_native_capacity: None,
+        getgeo: None,
+        set_read_only: None,
+        swap_slot_free_notify: None,
+        report_zones: None,
+        devnode: None,
+        alternative_gpt_sector: None,
+        get_unique_id: None,
+        // TODO: Set to THIS_MODULE. Waiting for const_refs_to_static feature to
+        // be merged (unstable in rustc 1.78 which is staged for linux 6.10)
+        // https://github.com/rust-lang/rust/issues/119618
+        owner: core::ptr::null_mut(),
+        pr_ops: core::ptr::null_mut(),
+        free_disk: None,
+        poll_bio: None,
+    };
+
+    // SAFETY: gendisk is a valid pointer as we initialized it above
+    unsafe { (*gendisk).fops = &TABLE };
+
+    // INVARIANT: `gendisk` was initialized above.
+    // INVARIANT: `gendisk.queue.queue_data` is set to `data` in the call to
+    // `__blk_mq_alloc_disk` above.
+    Ok(GenDisk {
+        _tagset: tagset,
+        gendisk,
+        _phantom: PhantomData,
+    })
+}
diff --git a/rust/kernel/block/mq/operations.rs b/rust/kernel/block/mq/operations.rs
new file mode 100644
index 000000000000..3bd1af2c2260
--- /dev/null
+++ b/rust/kernel/block/mq/operations.rs
@@ -0,0 +1,245 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module provides an interface for blk-mq drivers to implement.
+//!
+//! C header: [`include/linux/blk-mq.h`](srctree/include/linux/blk-mq.h)
+
+use crate::{
+    bindings,
+    block::mq::request::RequestDataWrapper,
+    block::mq::Request,
+    error::{from_result, Result},
+    types::ARef,
+};
+use core::{marker::PhantomData, sync::atomic::AtomicU64, sync::atomic::Ordering};
+
+/// Implement this trait to interface blk-mq as block devices.
+///
+/// To implement a block device driver, implement this trait as described in the
+/// [module level documentation]. The kernel will use the implementation of the
+/// functions defined in this trait to interface a block device driver. Note:
+/// There is no need for an exit_request() implementation, because the `drop`
+/// implementation of the [`Request`] type will be invoked automatically by
+/// the C/Rust glue logic.
+///
+/// [module level documentation]: kernel::block::mq
+#[macros::vtable]
+pub trait Operations: Sized {
+    /// Called by the kernel to queue a request with the driver. If `is_last` is
+    /// `false`, the driver is allowed to defer committing the request.
+    fn queue_rq(rq: ARef<Request<Self>>, is_last: bool) -> Result;
+
+    /// Called by the kernel to indicate that queued requests should be submitted.
+    fn commit_rqs();
+
+    /// Called by the kernel when the request is completed.
+    fn complete(_rq: ARef<Request<Self>>);
+
+    /// Called by the kernel to poll the device for completed requests. Only
+    /// used for poll queues.
+    fn poll() -> bool {
+        crate::build_error(crate::error::VTABLE_DEFAULT_ERROR)
+    }
+}
+
+/// A vtable for blk-mq to interact with a block device driver.
+///
+/// A `bindings::blk_mq_ops` vtable is constructed from pointers to the `extern
+/// "C"` functions of this struct, exposed through the `OperationsVTable::VTABLE`.
+///
+/// For general documentation of these methods, see the kernel source
+/// documentation related to `struct blk_mq_ops` in
+/// [`include/linux/blk-mq.h`].
+///
+/// [`include/linux/blk-mq.h`]: srctree/include/linux/blk-mq.h
+pub(crate) struct OperationsVTable<T: Operations>(PhantomData<T>);
+
+impl<T: Operations> OperationsVTable<T> {
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// - The caller of this function must ensure `bd` is valid
+    ///   and initialized. The pointees must outlive this function.
+    /// - This function must not be called with a `hctx` for which
+    ///   `Self::exit_hctx_callback()` has been called.
+    /// - (*bd).rq must point to a valid `bindings::request` for which
+    ///   `OperationsVTable::init_request_callback` was called
+    unsafe extern "C" fn queue_rq_callback(
+        _hctx: *mut bindings::blk_mq_hw_ctx,
+        bd: *const bindings::blk_mq_queue_data,
+    ) -> bindings::blk_status_t {
+        // SAFETY: `bd.rq` is valid as required by the safety requirement for
+        // this function.
+        let request = unsafe { &*(*bd).rq.cast::<Request<T>>() };
+
+        // One refcount for the ARef, one for being in flight
+        request.wrapper_ref().refcount().store(2, Ordering::Relaxed);
+
+        let rq =
+            // SAFETY: We own a refcount that we took above. We pass that to `ARef`.
+            // By the safety requirements of this function, `request` is a valid
+            // `struct request` and the private data is properly initialized.
+            unsafe { Request::aref_from_raw((*bd).rq) };
+
+        // SAFETY: We have exclusive access and we just set the refcount above.
+        unsafe { Request::start_unchecked(&rq) };
+
+        let ret = T::queue_rq(
+            rq,
+            // SAFETY: `bd` is valid as required by the safety requirement for this function.
+            unsafe { (*bd).last },
+        );
+
+        if let Err(e) = ret {
+            e.to_blk_status()
+        } else {
+            bindings::BLK_STS_OK as _
+        }
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure.
+    unsafe extern "C" fn commit_rqs_callback(_hctx: *mut bindings::blk_mq_hw_ctx) {
+        T::commit_rqs()
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure. `rq` must
+    /// point to a valid request that has been marked as completed. The pointee
+    /// of `rq` must be valid for write for the duration of this function.
+    unsafe extern "C" fn complete_callback(rq: *mut bindings::request) {
+        // SAFETY: This function can only be dispatched through
+        // `Request::complete`. We leaked a refcount then which we pick back up
+        // now.
+        let aref = unsafe { Request::aref_from_raw(rq) };
+        T::complete(aref);
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure.
+    unsafe extern "C" fn poll_callback(
+        _hctx: *mut bindings::blk_mq_hw_ctx,
+        _iob: *mut bindings::io_comp_batch,
+    ) -> core::ffi::c_int {
+        T::poll().into()
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure. This
+    /// function may only be called once before `exit_hctx_callback` is called
+    /// for the same context.
+    unsafe extern "C" fn init_hctx_callback(
+        _hctx: *mut bindings::blk_mq_hw_ctx,
+        _tagset_data: *mut core::ffi::c_void,
+        _hctx_idx: core::ffi::c_uint,
+    ) -> core::ffi::c_int {
+        from_result(|| Ok(0))
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure.
+    unsafe extern "C" fn exit_hctx_callback(
+        _hctx: *mut bindings::blk_mq_hw_ctx,
+        _hctx_idx: core::ffi::c_uint,
+    ) {
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure. `set` must
+    /// point to an initialized `TagSet`.
+    unsafe extern "C" fn init_request_callback(
+        _set: *mut bindings::blk_mq_tag_set,
+        rq: *mut bindings::request,
+        _hctx_idx: core::ffi::c_uint,
+        _numa_node: core::ffi::c_uint,
+    ) -> core::ffi::c_int {
+        from_result(|| {
+            // SAFETY: The `blk_mq_tag_set` invariants guarantee that all
+            // requests are allocated with extra memory for the request data.
+            let pdu = unsafe { bindings::blk_mq_rq_to_pdu(rq) }.cast::<RequestDataWrapper>();
+
+            // SAFETY: The refcount field is allocated but not initialized, so it
+            // is valid for write.
+            unsafe { RequestDataWrapper::refcount_ptr(pdu).write(AtomicU64::new(0)) };
+
+            Ok(0)
+        })
+    }
+
+    /// This function is called by the C kernel. A pointer to this function is
+    /// installed in the `blk_mq_ops` vtable for the driver.
+    ///
+    /// # Safety
+    ///
+    /// This function may only be called by blk-mq C infrastructure. `rq` must
+    /// point to a request that was initialized by a call to
+    /// `Self::init_request_callback`.
+    unsafe extern "C" fn exit_request_callback(
+        _set: *mut bindings::blk_mq_tag_set,
+        rq: *mut bindings::request,
+        _hctx_idx: core::ffi::c_uint,
+    ) {
+        // SAFETY: The tagset invariants guarantee that all requests are allocated with extra memory
+        // for the request data.
+        let pdu = unsafe { bindings::blk_mq_rq_to_pdu(rq) }.cast::<RequestDataWrapper>();
+
+        // SAFETY: `pdu` is valid for read and write and is properly initialised.
+        unsafe { core::ptr::drop_in_place(pdu) };
+    }
+
+    const VTABLE: bindings::blk_mq_ops = bindings::blk_mq_ops {
+        queue_rq: Some(Self::queue_rq_callback),
+        queue_rqs: None,
+        commit_rqs: Some(Self::commit_rqs_callback),
+        get_budget: None,
+        put_budget: None,
+        set_rq_budget_token: None,
+        get_rq_budget_token: None,
+        timeout: None,
+        poll: if T::HAS_POLL {
+            Some(Self::poll_callback)
+        } else {
+            None
+        },
+        complete: Some(Self::complete_callback),
+        init_hctx: Some(Self::init_hctx_callback),
+        exit_hctx: Some(Self::exit_hctx_callback),
+        init_request: Some(Self::init_request_callback),
+        exit_request: Some(Self::exit_request_callback),
+        cleanup_rq: None,
+        busy: None,
+        map_queues: None,
+        #[cfg(CONFIG_BLK_DEBUG_FS)]
+        show_rq: None,
+    };
+
+    pub(crate) const fn build() -> &'static bindings::blk_mq_ops {
+        &Self::VTABLE
+    }
+}
diff --git a/rust/kernel/block/mq/raw_writer.rs b/rust/kernel/block/mq/raw_writer.rs
new file mode 100644
index 000000000000..4f7e4692b592
--- /dev/null
+++ b/rust/kernel/block/mq/raw_writer.rs
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use core::fmt::{self, Write};
+
+use crate::error::Result;
+use crate::prelude::EINVAL;
+
+/// A mutable reference to a byte buffer where a string can be written into.
+///
+/// # Invariants
+///
+/// `buffer` is always null terminated.
+pub(crate) struct RawWriter<'a> {
+    buffer: &'a mut [u8],
+    pos: usize,
+}
+
+impl<'a> RawWriter<'a> {
+    /// Create a new `RawWriter` instance.
+    fn new(buffer: &'a mut [u8]) -> Result<RawWriter<'a>> {
+        *(buffer.last_mut().ok_or(EINVAL)?) = 0;
+
+        // INVARIANT: We null terminated the buffer above
+        Ok(Self { buffer, pos: 0 })
+    }
+
+    pub(crate) fn from_array<const N: usize>(
+        a: &'a mut [core::ffi::c_char; N],
+    ) -> Result<RawWriter<'a>> {
+        Self::new(
+            // SAFETY: the buffer of `a` is valid for read and write as `u8` for
+            // at least `N` bytes.
+            unsafe { core::slice::from_raw_parts_mut(a.as_mut_ptr().cast::<u8>(), N) },
+        )
+    }
+}
+
+impl Write for RawWriter<'_> {
+    fn write_str(&mut self, s: &str) -> fmt::Result {
+        let bytes = s.as_bytes();
+        let len = bytes.len();
+
+        // We do not want to overwrite our null terminator
+        if self.pos + len > self.buffer.len() - 1 {
+            return Err(fmt::Error);
+        }
+
+        // INVARIANT: We are not overwriting the last byte
+        self.buffer[self.pos..self.pos + len].copy_from_slice(bytes);
+
+        self.pos += len;
+
+        Ok(())
+    }
+}
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
new file mode 100644
index 000000000000..db5d760615d7
--- /dev/null
+++ b/rust/kernel/block/mq/request.rs
@@ -0,0 +1,227 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module provides a wrapper for the C `struct request` type.
+//!
+//! C header: [`include/linux/blk-mq.h`](srctree/include/linux/blk-mq.h)
+
+use crate::{
+    bindings,
+    block::mq::Operations,
+    error::Result,
+    types::{ARef, AlwaysRefCounted, Opaque},
+};
+use core::{
+    marker::PhantomData,
+    ptr::{addr_of_mut, NonNull},
+    sync::atomic::{AtomicU64, Ordering},
+};
+
+/// A wrapper around a blk-mq `struct request`. This represents an IO request.
+///
+/// # Invariants
+///
+/// * `self.0` is a valid `struct request` created by the C portion of the kernel.
+/// * The private data area associated with this request must be an initialized
+///   and valid `RequestDataWrapper`.
+/// * `self` is reference counted by atomic modification of
+///   self.wrapper_ref().refcount().
+///
+#[repr(transparent)]
+pub struct Request<T: Operations>(Opaque<bindings::request>, PhantomData<T>);
+
+impl<T: Operations> Request<T> {
+    /// Create an `ARef` from a `struct request` pointer.
+    ///
+    /// # Safety
+    ///
+    /// * The caller must own a refcount on `ptr` that is transferred to the
+    ///   returned `ARef`.
+    /// * The type invariants for `Request` must hold for the pointee of `ptr`.
+    pub(crate) unsafe fn aref_from_raw(ptr: *mut bindings::request) -> ARef<Self> {
+        // INVARIANTS: By the safety requirements of this function, invariants are upheld.
+        // SAFETY: By the safety requirement of this function, we own a
+        // reference count that we can pass to `ARef`.
+        unsafe { ARef::from_raw(NonNull::new_unchecked(ptr as *const Self as *mut Self)) }
+    }
+
+    /// Notify the block layer that a request is going to be processed now.
+    ///
+    /// The block layer uses this hook to do proper initializations such as
+    /// starting the timeout timer. It is a requirement that block device
+    /// drivers call this function when starting to process a request.
+    ///
+    /// # Safety
+    ///
+    /// The caller must have exclusive ownership of `self`, that is
+    /// `self.wrapper_ref().refcount() == 2`.
+    pub(crate) unsafe fn start_unchecked(this: &ARef<Self>) {
+        // SAFETY: By type invariant, `self.0` is a valid `struct request`. By
+        // existence of `&mut self` we have exclusive access.
+        unsafe { bindings::blk_mq_start_request(this.0.get()) };
+    }
+
+    fn try_set_end(this: ARef) -> Result, ARef> {
+        // We can race with `TagSet::tag_to_rq`
+        match this.wrapper_ref().refcount().compare_exchange(
+            2,
+            0,
+            Ordering::Relaxed,
+            Ordering::Relaxed,
+        ) {
+            Err(_old) => Err(this),
+            Ok(_) => Ok(this),
+        }
+    }
+
+    /// Notify the block layer that the request has been completed without errors.
+    ///
+    /// This function will return `Err` if `this` is not the only `ARef`
+    /// referencing the request.
+    pub fn end_ok(this: ARef) -> Result<(), ARef> {
+        let this = Self::try_set_end(this)?;
+        let request_ptr = this.0.get();
+        core::mem::forget(this);
+
+        // SAFETY: By type invariant, `self.0` is a valid `struct request`. By
+        // existence of `&mut self` we have exclusive access.
+        unsafe { bindings::blk_mq_end_request(request_ptr, bindings::BLK_STS_OK as _) };
+
+        Ok(())
+    }
+
+    /// Return a pointer to the `RequestDataWrapper` stored in the private area
+    /// of the request structure.
+    ///
+    /// # Safety
+    ///
+    /// - `this` must point to a valid allocation.
+    pub(crate) unsafe fn wrapper_ptr(this: *mut Self) -> NonNull {
+        let request_ptr = this.cast::();
+        let wrapper_ptr =
+            // SAFETY: By safety requirements for this function, `this` is a
+            // valid allocation.
+            unsafe { bindings::blk_mq_rq_to_pdu(request_ptr).cast::() };
+        // SAFETY: By C API contract, wrapper_ptr points to a valid allocation
+        // and is not null.
+        unsafe { NonNull::new_unchecked(wrapper_ptr) }
+    }
+
+    /// Return a reference to the `RequestDataWrapper` stored in the private
+    /// area of the request structure.
+    pub(crate) fn wrapper_ref(&self) -> &RequestDataWrapper {
+        // SAFETY: By type invariant, `self.0` is a valid allocation. Further,
+        // the private data associated with this request is initialized and
+        // valid. The existence of `&self` guarantees that the private data is
+        // valid as a shared reference.
+        unsafe { Self::wrapper_ptr(self as *const Self as *mut Self).as_ref() }
+    }
+}
+
+/// A wrapper around data stored in the private area of the C `struct request`.
+pub(crate) struct RequestDataWrapper {
+    /// The Rust request refcount has the following states:
+    ///
+    /// - 0: The request is owned by the C block layer.
+    /// - 1: The request is owned by Rust abstractions but there are no `ARef` references to it.
+    /// - 2+: There are `ARef` references to the request.
+    refcount: AtomicU64,
+}
+
+impl RequestDataWrapper {
+    /// Return a reference to the refcount of the request that is embedding
+    /// `self`.
+    pub(crate) fn refcount(&self) -> &AtomicU64 {
+        &self.refcount
+    }
+
+    /// Return a pointer to the refcount of the request that is embedding the
+    /// pointee of `this`.
+    ///
+    /// # Safety
+    ///
+    /// - `this` must point to a live allocation of at least the size of `Self`.
+    pub(crate) unsafe fn refcount_ptr(this: *mut Self) -> *mut AtomicU64 {
+        // SAFETY: Because of the safety requirements of this function, the
+        // field projection is safe.
+        unsafe { addr_of_mut!((*this).refcount) }
+    }
+}
+
+// SAFETY: Exclusive access is thread-safe for `Request`. `Request` has no `&mut
+// self` methods and `&self` methods that mutate `self` are internally
+// synchronized.
+unsafe impl Send for Request {}
+
+// SAFETY: Shared access is thread-safe for `Request`. `&self` methods that
+// mutate `self` are internally synchronized.
+unsafe impl Sync for Request {}
+
+/// Store the result of `op(target.load())` in `target`, returning the new
+/// value of `target`.
+fn atomic_relaxed_op_return(target: &AtomicU64, op: impl Fn(u64) -> u64) -> u64 {
+    let mut old = target.load(Ordering::Relaxed);
+    loop {
+        match target.compare_exchange_weak(old, op(old), Ordering::Relaxed, Ordering::Relaxed) {
+            Ok(_) => break,
+            Err(x) => {
+                old = x;
+            }
+        }
+    }
+
+    op(old)
+}
+
+/// Store the result of `op(target.load())` in `target` if `target.load() !=
+/// pred`, returning `true` if `target` was updated.
+fn atomic_relaxed_op_unless(target: &AtomicU64, op: impl Fn(u64) -> u64, pred: u64) -> bool {
+    let mut x = target.load(Ordering::Relaxed);
+    loop {
+        if x == pred {
+            break;
+        }
+        match target.compare_exchange_weak(x, op(x), Ordering::Relaxed, Ordering::Relaxed) {
+            Ok(_) => break,
+            // The exchange failed, possibly spuriously: retry with the value
+            // that was actually observed.
+            Err(current) => x = current,
+        }
+    }
+
+    x != pred
+}
+
+// SAFETY: All instances of `Request` are reference counted. This
+// implementation of `AlwaysRefCounted` ensures that increments to the ref count
+// keep the object alive in memory at least until a matching reference count
+// decrement is executed.
+unsafe impl AlwaysRefCounted for Request {
+    fn inc_ref(&self) {
+        let refcount = &self.wrapper_ref().refcount();
+
+        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
+        let updated = atomic_relaxed_op_unless(refcount, |x| x + 1, 0);
+
+        #[cfg(CONFIG_DEBUG_MISC)]
+        if !updated {
+            panic!("Request refcount zero on clone")
+        }
+    }
+
+    unsafe fn dec_ref(obj: core::ptr::NonNull) {
+        // SAFETY: The type invariants of `ARef` guarantee that `obj` is valid
+        // for read.
+        let wrapper_ptr = unsafe { Self::wrapper_ptr(obj.as_ptr()).as_ptr() };
+        // SAFETY: The type invariant of `Request` guarantees that the private
+        // data area is initialized and valid.
+        let refcount = unsafe { &*RequestDataWrapper::refcount_ptr(wrapper_ptr) };
+
+        #[cfg_attr(not(CONFIG_DEBUG_MISC), allow(unused_variables))]
+        let new_refcount = atomic_relaxed_op_return(refcount, |x| x - 1);
+
+        #[cfg(CONFIG_DEBUG_MISC)]
+        if new_refcount == 0 {
+            panic!("Request reached refcount zero in Rust abstractions");
+        }
+    }
+}
diff --git a/rust/kernel/block/mq/tag_set.rs b/rust/kernel/block/mq/tag_set.rs
new file mode 100644
index 000000000000..4217c2b03ff3
--- /dev/null
+++ b/rust/kernel/block/mq/tag_set.rs
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This module provides the `TagSet` struct to wrap the C `struct blk_mq_tag_set`.
+//!
+//! C header: [`include/linux/blk-mq.h`](srctree/include/linux/blk-mq.h)
+
+use core::pin::Pin;
+
+use crate::{
+    bindings,
+    block::mq::request::RequestDataWrapper,
+    block::mq::{operations::OperationsVTable, Operations},
+    error::{self, Error, Result},
+    prelude::PinInit,
+    try_pin_init,
+    types::Opaque,
+};
+use core::{convert::TryInto, marker::PhantomData};
+use macros::{pin_data, pinned_drop};
+
+/// A wrapper for the C `struct blk_mq_tag_set`.
+///
+/// `struct blk_mq_tag_set` contains a `struct list_head` and so must be pinned.
+#[pin_data(PinnedDrop)]
+#[repr(transparent)]
+pub struct TagSet {
+    #[pin]
+    inner: Opaque,
+    _p: PhantomData,
+}
+
+impl TagSet {
+    /// Try to create a new tag set
+    pub fn try_new(
+        nr_hw_queues: u32,
+        num_tags: u32,
+        num_maps: u32,
+    ) -> impl PinInit {
+        try_pin_init!( TagSet {
+            inner <- unsafe {kernel::init::pin_init_from_closure(move |place: *mut Opaque| -> Result<()> {
+                let place = place.cast::();
+
+                // SAFETY: try_ffi_init promises that `place` is writable, and
+                // zeroes is a valid bit pattern for this structure.
+                core::ptr::write_bytes(place, 0, 1);
+
+                /// For a raw pointer to a struct, write a struct field without
+                /// creating a reference to the field
+                macro_rules! write_ptr_field {
+                    ($target:ident, $field:ident, $value:expr) => {
+                        ::core::ptr::write(::core::ptr::addr_of_mut!((*$target).$field), $value)
+                    };
+                }
+
+                // SAFETY: try_ffi_init promises that `place` is writable
+                write_ptr_field!(place, ops, OperationsVTable::::build());
+                write_ptr_field!(place, nr_hw_queues , nr_hw_queues);
+                write_ptr_field!(place, timeout , 0); // 0 means default which is 30 * HZ in C
+                write_ptr_field!(place, numa_node , bindings::NUMA_NO_NODE);
+                write_ptr_field!(place, queue_depth , num_tags);
+                write_ptr_field!(place, cmd_size , core::mem::size_of::().try_into()?);
+                write_ptr_field!(place, flags , bindings::BLK_MQ_F_SHOULD_MERGE);
+                write_ptr_field!(place, driver_data , core::ptr::null_mut::());
+                write_ptr_field!(place, nr_maps , num_maps);
+
+                // SAFETY: Relevant fields of `place` are initialised above
+                let ret = bindings::blk_mq_alloc_tag_set(place);
+                if ret < 0 {
+                    return Err(Error::from_errno(ret));
+                }
+
+                Ok(())
+            })},
+            _p: PhantomData,
+        })
+    }
+
+    /// Return the pointer to the wrapped `struct blk_mq_tag_set`
+    pub(crate) fn raw_tag_set(&self) -> *mut bindings::blk_mq_tag_set {
+        self.inner.get()
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for TagSet {
+    fn drop(self: Pin<&mut Self>) {
+        // SAFETY: We are not moving self below
+        let this = unsafe { Pin::into_inner_unchecked(self) };
+
+        // SAFETY: `inner` is valid and has been properly initialised during construction.
+        unsafe { bindings::blk_mq_free_tag_set(this.inner.get()) };
+    }
+}
diff --git a/rust/kernel/error.rs b/rust/kernel/error.rs
index 4786d3ee1e92..469b591cf5c7 100644
--- a/rust/kernel/error.rs
+++ b/rust/kernel/error.rs
@@ -130,6 +130,11 @@ pub fn to_errno(self) -> core::ffi::c_int {
         self.0
     }
 
+    pub(crate) fn to_blk_status(self) -> bindings::blk_status_t {
+        // SAFETY: `self.0` is a valid error due to its invariant.
+        unsafe { bindings::errno_to_blk_status(self.0) }
+    }
+
     /// Returns the error encoded as a pointer.
     #[allow(dead_code)]
     pub(crate) fn to_ptr(self) -> *mut T {
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index be68d5e567b1..beede6017bc1 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -31,6 +31,8 @@
 #[cfg(not(test))]
 #[cfg(not(testlib))]
 mod allocator;
+#[cfg(CONFIG_BLOCK)]
+pub mod block;
 mod build_assert;
 pub mod error;
 pub mod init;
-- 
2.44.0

From nobody Sun Feb 8 18:37:40 2026
From: Andreas Hindborg
To: Jens Axboe, Christoph Hellwig, Keith Busch, Damien Le Moal, Bart Van Assche, Hannes Reinecke, Ming Lei, "linux-block@vger.kernel.org"
Cc: Andreas Hindborg, Greg KH, Matthew Wilcox, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Alice Ryhl, Chaitanya Kulkarni, Luis Chamberlain, Yexuan Yang <1182282462@bupt.edu.cn>, Sergio González Collado, Joel Granados, "Pankaj Raghav (Samsung)", Daniel Gomez, Niklas Cassel, Philipp Stanner, Conor Dooley, Johannes Thumshirn, Matias Bjørling, open list, "rust-for-linux@vger.kernel.org", "lsf-pc@lists.linux-foundation.org", "gost.dev@samsung.com"
Subject: [PATCH 2/3] rust: block: add rnull, Rust null_blk implementation
Date: Sun, 12 May 2024 12:39:47 -0600
Message-ID: <20240512183950.1982353-3-nmi@metaspace.dk>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240512183950.1982353-1-nmi@metaspace.dk>
References: <20240512183950.1982353-1-nmi@metaspace.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andreas Hindborg

This patch adds an initial version of the Rust null block driver.
Signed-off-by: Andreas Hindborg
---
 drivers/block/Kconfig  |  9 +++++
 drivers/block/Makefile |  3 ++
 drivers/block/rnull.rs | 82 ++++++++++++++++++++++++++++++++++++++++++
 scripts/Makefile.build |  2 +-
 4 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100644 drivers/block/rnull.rs

diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 5b9d4aaebb81..ed209f4f2798 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -354,6 +354,15 @@ config VIRTIO_BLK
 	  This is the virtual block driver for virtio. It can be used with
 	  QEMU based VMMs (like KVM or Xen). Say Y or M.
 
+config BLK_DEV_RUST_NULL
+	tristate "Rust null block driver (Experimental)"
+	depends on RUST
+	help
+	  This is the Rust implementation of the null block driver. For now it
+	  is only a minimal stub.
+
+	  If unsure, say N.
+
 config BLK_DEV_RBD
 	tristate "Rados block device (RBD)"
 	depends on INET && BLOCK
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index 101612cba303..1105a2d4fdcb 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -9,6 +9,9 @@
 # needed for trace events
 ccflags-y	+= -I$(src)
 
+obj-$(CONFIG_BLK_DEV_RUST_NULL) += rnull_mod.o
+rnull_mod-y := rnull.o
+
 obj-$(CONFIG_MAC_FLOPPY)	+= swim3.o
 obj-$(CONFIG_BLK_DEV_SWIM)	+= swim_mod.o
 obj-$(CONFIG_BLK_DEV_FD)	+= floppy.o
diff --git a/drivers/block/rnull.rs b/drivers/block/rnull.rs
new file mode 100644
index 000000000000..80e240a95446
--- /dev/null
+++ b/drivers/block/rnull.rs
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! This is a Rust implementation of the C null block driver.
+//!
+//! Supported features:
+//!
+//! - blk-mq interface
+//! - direct completion
+//! - block size 4k
+//!
+//! The driver is not configurable.
+
+use kernel::{
+    block::mq::{
+        self,
+        gen_disk::{self, GenDisk},
+        Operations, TagSet,
+    },
+    error::Result,
+    new_mutex, pr_info,
+    prelude::*,
+    sync::{Arc, Mutex},
+    types::ARef,
+};
+
+module! {
+    type: NullBlkModule,
+    name: "rnull_mod",
+    author: "Andreas Hindborg",
+    license: "GPL v2",
+}
+
+struct NullBlkModule {
+    _disk: Pin>>>,
+}
+
+fn add_disk(tagset: Arc>) -> Result> {
+    let block_size: u16 = 4096;
+    if block_size % 512 != 0 || !(512..=4096).contains(&block_size) {
+        return Err(kernel::error::code::EINVAL);
+    }
+
+    let mut disk = gen_disk::try_new(tagset)?;
+    disk.set_name(format_args!("rnullb{}", 0))?;
+    disk.set_capacity_sectors(4096 << 11);
+    disk.set_queue_logical_block_size(block_size.into());
+    disk.set_queue_physical_block_size(block_size.into());
+    disk.set_rotational(false);
+    disk.add()
+}
+
+impl kernel::Module for NullBlkModule {
+    fn init(_module: &'static ThisModule) -> Result {
+        pr_info!("Rust null_blk loaded\n");
+        let tagset = Arc::pin_init(TagSet::try_new(1, 256, 1))?;
+        let disk = Box::pin_init(new_mutex!(add_disk(tagset)?, "nullb:disk"))?;
+
+        Ok(Self { _disk: disk })
+    }
+}
+
+struct NullBlkDevice;
+
+#[vtable]
+impl Operations for NullBlkDevice {
+    #[inline(always)]
+    fn queue_rq(rq: ARef>, _is_last: bool) -> Result {
+        mq::Request::end_ok(rq)
+            .map_err(|_e| kernel::error::code::EIO)
+            .expect("Failed to complete request");
+
+        Ok(())
+    }
+
+    fn commit_rqs() {}
+
+    fn complete(rq: ARef>) {
+        mq::Request::end_ok(rq)
+            .map_err(|_e| kernel::error::code::EIO)
+            .expect("Failed to complete request")
+    }
+}
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index baf86c0880b6..603dee4b66c4 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -263,7 +263,7 @@ $(obj)/%.lst: $(src)/%.c FORCE
 # Compile Rust sources (.rs)
 # ---------------------------------------------------------------------------
 
-rust_allowed_features := new_uninit,offset_of
+rust_allowed_features := new_uninit,offset_of,allocator_api
 
 # `--out-dir` is required to avoid temporaries being created by `rustc` in the
 # current working directory, which may be not accessible in the out-of-tree
-- 
2.44.0

From nobody Sun Feb 8 18:37:40 2026
From: Andreas Hindborg
To: Jens Axboe, Christoph Hellwig, Keith Busch, Damien Le Moal, Bart Van Assche, Hannes Reinecke, Ming Lei, "linux-block@vger.kernel.org"
Cc: Andreas Hindborg, Greg KH, Matthew Wilcox, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Alice Ryhl, Chaitanya Kulkarni, Luis Chamberlain, Yexuan Yang <1182282462@bupt.edu.cn>, Sergio González Collado, Joel Granados, "Pankaj Raghav (Samsung)", Daniel Gomez, Niklas Cassel, Philipp Stanner, Conor Dooley, Johannes Thumshirn, Matias Bjørling, open list, "rust-for-linux@vger.kernel.org", "lsf-pc@lists.linux-foundation.org", "gost.dev@samsung.com"
Subject: [PATCH 3/3] MAINTAINERS: add entry for Rust block device driver API
Date: Sun, 12 May 2024 12:39:48 -0600
Message-ID: <20240512183950.1982353-4-nmi@metaspace.dk>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240512183950.1982353-1-nmi@metaspace.dk>
References: <20240512183950.1982353-1-nmi@metaspace.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andreas Hindborg

Add an entry for the Rust block device driver abstractions.
Signed-off-by: Andreas Hindborg
---
 MAINTAINERS | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index aea47e04c3a5..eec752f09d43 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3657,6 +3657,20 @@ F:	include/linux/blk*
 F:	kernel/trace/blktrace.c
 F:	lib/sbitmap.c
 
+BLOCK LAYER DEVICE DRIVER API [RUST]
+M:	Andreas Hindborg
+R:	Boqun Feng
+L:	linux-block@vger.kernel.org
+L:	rust-for-linux@vger.kernel.org
+S:	Supported
+W:	https://rust-for-linux.com
+B:	https://github.com/Rust-for-Linux/linux/issues
+C:	https://rust-for-linux.zulipchat.com/#narrow/stream/Block
+T:	git https://github.com/Rust-for-Linux/linux.git rust-block-next
+F:	drivers/block/rnull.rs
+F:	rust/kernel/block.rs
+F:	rust/kernel/block/
+
 BLOCK2MTD DRIVER
 M:	Joern Engel
 L:	linux-mtd@lists.infradead.org
-- 
2.44.0
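
Note on the request refcount states documented on `RequestDataWrapper` in the request.rs hunk of this series (0: owned by the C block layer, 1: owned by the Rust abstractions with no `ARef`, 2+: at least one `ARef` exists). The transitions can be tried out in user space with plain `std` atomics. The sketch below is not kernel code and is not part of the series: `RefState`, `new_owned_by_rust`, `get`, `take_ref`, `drop_ref` and `try_end` are hypothetical names chosen for illustration. Only the state numbering, the relaxed ordering and the 2 -> 0 compare-exchange are taken from the patch. It uses `fetch_update` from the standard library where the patch writes its own compare-exchange loop.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for the refcount kept in the request's private data.
pub struct RefState {
    refcount: AtomicU64,
}

impl RefState {
    /// Start in state 1: owned by the Rust side, no references handed out.
    pub fn new_owned_by_rust() -> Self {
        Self { refcount: AtomicU64::new(1) }
    }

    /// Current counter value (relaxed load, for inspection only).
    pub fn get(&self) -> u64 {
        self.refcount.load(Ordering::Relaxed)
    }

    /// Increment unless the counter is 0; returns `true` if it was incremented.
    pub fn take_ref(&self) -> bool {
        self.refcount
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |x| {
                if x == 0 { None } else { Some(x + 1) }
            })
            .is_ok()
    }

    /// Decrement the counter and return the new value.
    pub fn drop_ref(&self) -> u64 {
        self.refcount.fetch_sub(1, Ordering::Relaxed) - 1
    }

    /// Hand the request back (2 -> 0); succeeds only for the sole reference holder.
    pub fn try_end(&self) -> bool {
        self.refcount
            .compare_exchange(2, 0, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
    }
}
```

In this model `try_end` fails whenever a second reference is outstanding, which corresponds to `end_ok` returning `Err` when `this` is not the only `ARef`, and `take_ref` refuses to resurrect a request that has already been handed back to the C side.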