On Fri, Mar 27, 2026 at 11:48 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 3/27/26 4:47 AM, Chengkaitao wrote:
> > I have been working on adding a new BPF-based I/O scheduler. It has both
> > kernel and user-space parts. In kernel space, using per-ctx,
>
> Does "ctx" perhaps refer to struct blk_mq_hw_ctx? If so, please use the
> abbreviation "hctx" to prevent confusion with struct blk_mq_ctx (block
> layer software queue).
Here, ctx refers to struct blk_mq_ctx, i.e. the software queue. The
intent is: when no eBPF policy is attached, the new I/O scheduler
behaves like the none scheduler; when an eBPF policy is attached, the
blk_mq_ctx queues maintained by the new I/O scheduler serve as a
fallback for the eBPF program.
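
To make the fallback idea concrete, here is a rough userspace model in
plain C. It is not the actual patch and does not use any kernel or BPF
API: struct sched, ctx_queue, policy_dispatch_fn and one_shot_policy
are all hypothetical names, and the assumption that the policy holds
its own requests while everything else stays in the per-ctx FIFOs is a
simplification made only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CTX    2   /* hypothetical number of software queues */
#define CTX_DEPTH 8   /* hypothetical per-queue capacity */

struct req { int id; };

/* Stand-in for one per-ctx software queue: a small FIFO ring. */
struct ctx_queue {
	struct req *slot[CTX_DEPTH];
	size_t head;   /* index of the oldest entry */
	size_t count;  /* number of queued entries */
};

/* Stand-in for an attached policy's dispatch hook; NULL means "decline". */
typedef struct req *(*policy_dispatch_fn)(void *data);

struct sched {
	struct ctx_queue ctx[NR_CTX];
	policy_dispatch_fn policy;  /* NULL while no policy is attached */
	void *policy_data;
};

static int ctx_push(struct ctx_queue *q, struct req *r)
{
	assert(q != NULL && r != NULL);
	if (q->count == CTX_DEPTH)
		return -1;  /* queue full */
	q->slot[(q->head + q->count) % CTX_DEPTH] = r;
	q->count++;
	return 0;
}

static struct req *ctx_pop(struct ctx_queue *q)
{
	struct req *r;

	if (q->count == 0)
		return NULL;
	r = q->slot[q->head];
	q->head = (q->head + 1) % CTX_DEPTH;
	q->count--;
	return r;
}

/* Example policy: hands out one privately held request, then declines. */
static struct req *one_shot_policy(void *data)
{
	struct req **held = data;
	struct req *r = *held;

	*held = NULL;
	return r;
}

/* Ask the policy first; fall back to the per-ctx FIFOs, in index order. */
static struct req *sched_dispatch(struct sched *s)
{
	size_t i;

	if (s->policy) {
		struct req *r = s->policy(s->policy_data);
		if (r)
			return r;
	}
	for (i = 0; i < NR_CTX; i++) {
		struct req *r = ctx_pop(&s->ctx[i]);
		if (r)
			return r;
	}
	return NULL;
}
```

With policy left NULL, sched_dispatch() simply drains the per-ctx
FIFOs, which is the "behaves like none" case in this model.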
> For what type of block devices is this new type of I/O scheduler
> intended? This new type of I/O scheduler is not appropriate for hard
> disks. To schedule I/O effectively for harddisks, an I/O scheduler must
> be aware of all pending I/O requests. This is why the mq-deadline I/O
> scheduler maintains a single list of requests across all hardware
> queues.
This new I/O scheduler is intended for mechanical hard disks. The
scheduler can be made aware of all pending I/O requests, because users
can maintain a single list of requests in an eBPF program.
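
As a rough illustration of what "a single list of requests" could look
like, here is a userspace sketch in plain C. It is only a model: it
uses no BPF helpers or kernel list types, the names global_list,
gl_insert and gl_dispatch are hypothetical, and the one-way elevator
below is a deliberately simple stand-in, not mq-deadline's algorithm
and not the policy from my patch.

```c
#include <assert.h>
#include <stddef.h>

struct req {
	unsigned long long sector;  /* start sector of the request */
	struct req *next;
};

/* One list shared by all queues, kept in ascending sector order. */
struct global_list {
	struct req *first;
	unsigned long long head_pos;  /* start sector of the last dispatch */
};

/* Sorted insert; requests with equal sectors stay in arrival order. */
static void gl_insert(struct global_list *gl, struct req *r)
{
	struct req **pp;

	assert(gl != NULL && r != NULL);
	pp = &gl->first;
	while (*pp && (*pp)->sector <= r->sector)
		pp = &(*pp)->next;
	r->next = *pp;
	*pp = r;
}

/*
 * Pick the first request at or beyond head_pos; if there is none,
 * wrap around to the lowest sector (a one-way sweep).
 */
static struct req *gl_dispatch(struct global_list *gl)
{
	struct req **pp = &gl->first;
	struct req *r;

	if (!gl->first)
		return NULL;
	while (*pp && (*pp)->sector < gl->head_pos)
		pp = &(*pp)->next;
	if (!*pp)
		pp = &gl->first;  /* nothing ahead of the head: wrap */
	r = *pp;
	*pp = r->next;
	r->next = NULL;
	gl->head_pos = r->sector;
	return r;
}
```

Because every request lands in the same list regardless of which queue
it arrived on, the dispatch decision sees all pending requests at once.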
> Additionally, this new I/O scheduler is not appropriate for the fastest
> block devices. For very fast block devices, any I/O scheduler incurs a
> measurable overhead.
For the fastest block devices, avoiding any extra scheduling policy is
often the best policy. In some customized workloads, however, extra
policy may be needed, for example priority scheduling, cgroup-aware
differentiation, or fine-grained metrics. I have not validated those
scenarios with real demos yet, but the approach seems viable.
> > I implemented
> > a simple elevator that exposes a set of BPF hooks. The goal is to move the
> > policy side of I/O scheduling out of the kernel and into user space,
>
> What does "into user space" mean in this context? As you know BPF code
> runs in kernel context.
Sorry, my wording was inaccurate. It would be more appropriate to
phrase it as "into user-defined BPF programs".
--
Yours,
Chengkaitao