From nobody Fri Dec 19 19:23:40 2025
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com, Martin KaFai Lau, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC v2 1/5] io_uring: add struct for state controlling cqwait
Date: Fri, 6 Jun 2025 14:57:58 +0100
Message-ID: <933217fc63d9f7753e0e3e8dc239ba1a3f15add4.1749214572.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.49.0

Add struct iou_loop_state and move into it the parameters that control
the flow of normal CQ waiting. It will be exposed to BPF as part of the
helper API, and while I could've used struct io_wait_queue, the name is
not ideal, and keeping only the necessary bits makes further development
a bit cleaner.

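For illustration only, here is a minimal userspace sketch of how the two
grouped fields can be consumed. It is not code from this patch:
loop_state_sketch, sketch_ktime_t and sketch_reached_target() are
hypothetical names, and the int64_t stand-in for ktime_t is an assumption.
It shows why comparing the CQ tail against target_cq_tail through a signed
difference keeps working when the 32-bit counters wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's ktime_t; an assumption of this sketch. */
typedef int64_t sketch_ktime_t;

/* Hypothetical mirror of the two fields grouped into struct iou_loop_state. */
struct loop_state_sketch {
	uint32_t target_cq_tail;	/* CQ tail value the waiter wants to reach */
	sketch_ktime_t timeout;		/* absolute deadline for the wait */
};

/*
 * Hypothetical helper: true once cq_tail has reached the target.
 * The subtraction is done in unsigned 32-bit arithmetic and only then
 * interpreted as signed, so the result stays correct across wraparound.
 */
static bool sketch_reached_target(uint32_t cq_tail,
				  const struct loop_state_sketch *state)
{
	int32_t dist = (int32_t)(cq_tail - state->target_cq_tail);

	return dist >= 0;
}
```

A direct comparison such as cq_tail >= target_cq_tail would give the wrong
answer once the target has wrapped past zero while the tail has not.
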
Signed-off-by: Pavel Begunkov
---
 io_uring/io_uring.c | 20 ++++++++++----------
 io_uring/io_uring.h | 11 ++++++++---
 io_uring/napi.c     |  4 ++--
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 5cdccf65c652..9cc4d8f335a1 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -2404,8 +2404,8 @@ static enum hrtimer_restart io_cqring_min_timer_wakeup(struct hrtimer *timer)
 	struct io_ring_ctx *ctx = iowq->ctx;
 
 	/* no general timeout, or shorter (or equal), we are done */
-	if (iowq->timeout == KTIME_MAX ||
-	    ktime_compare(iowq->min_timeout, iowq->timeout) >= 0)
+	if (iowq->state.timeout == KTIME_MAX ||
+	    ktime_compare(iowq->min_timeout, iowq->state.timeout) >= 0)
 		goto out_wake;
 	/* work we may need to run, wake function will see if we need to wake */
 	if (io_has_work(ctx))
@@ -2431,7 +2431,7 @@ static enum hrtimer_restart io_cqring_min_timer_wakeup(struct hrtimer *timer)
 	}
 
 	hrtimer_update_function(&iowq->t, io_cqring_timer_wakeup);
-	hrtimer_set_expires(timer, iowq->timeout);
+	hrtimer_set_expires(timer, iowq->state.timeout);
 	return HRTIMER_RESTART;
 out_wake:
 	return io_cqring_timer_wakeup(timer);
@@ -2447,7 +2447,7 @@ static int io_cqring_schedule_timeout(struct io_wait_queue *iowq,
 		hrtimer_setup_on_stack(&iowq->t, io_cqring_min_timer_wakeup, clock_id,
 				       HRTIMER_MODE_ABS);
 	} else {
-		timeout = iowq->timeout;
+		timeout = iowq->state.timeout;
 		hrtimer_setup_on_stack(&iowq->t, io_cqring_timer_wakeup, clock_id,
 				       HRTIMER_MODE_ABS);
 	}
@@ -2488,7 +2488,7 @@ static int __io_cqring_wait_schedule(struct io_ring_ctx *ctx,
 	 */
 	if (ext_arg->iowait && current_pending_io())
 		current->in_iowait = 1;
-	if (iowq->timeout != KTIME_MAX || iowq->min_timeout)
+	if (iowq->state.timeout != KTIME_MAX || iowq->min_timeout)
 		ret = io_cqring_schedule_timeout(iowq, ctx->clockid, start_time);
 	else
 		schedule();
@@ -2546,18 +2546,18 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 	iowq.wq.private = current;
 	INIT_LIST_HEAD(&iowq.wq.entry);
 	iowq.ctx = ctx;
-	iowq.cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
+	iowq.state.target_cq_tail = READ_ONCE(ctx->rings->cq.head) + min_events;
 	iowq.cq_min_tail = READ_ONCE(ctx->rings->cq.tail);
 	iowq.nr_timeouts = atomic_read(&ctx->cq_timeouts);
 	iowq.hit_timeout = 0;
 	iowq.min_timeout = ext_arg->min_time;
-	iowq.timeout = KTIME_MAX;
+	iowq.state.timeout = KTIME_MAX;
 	start_time = io_get_time(ctx);
 
 	if (ext_arg->ts_set) {
-		iowq.timeout = timespec64_to_ktime(ext_arg->ts);
+		iowq.state.timeout = timespec64_to_ktime(ext_arg->ts);
 		if (!(flags & IORING_ENTER_ABS_TIMER))
-			iowq.timeout = ktime_add(iowq.timeout, start_time);
+			iowq.state.timeout = ktime_add(iowq.state.timeout, start_time);
 	}
 
 	if (ext_arg->sig) {
@@ -2582,7 +2582,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events, u32 flags,
 
 		/* if min timeout has been hit, don't reset wait count */
 		if (!iowq.hit_timeout)
-			nr_wait = (int) iowq.cq_tail -
+			nr_wait = (int) iowq.state.target_cq_tail -
 					READ_ONCE(ctx->rings->cq.tail);
 		else
 			nr_wait = 1;
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 0ea7a435d1de..edf698b81a95 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -39,15 +39,19 @@ enum {
 	IOU_REQUEUE = -3072,
 };
 
+struct iou_loop_state {
+	__u32 target_cq_tail;
+	ktime_t timeout;
+};
+
 struct io_wait_queue {
+	struct iou_loop_state state;
 	struct wait_queue_entry wq;
 	struct io_ring_ctx *ctx;
-	unsigned cq_tail;
 	unsigned cq_min_tail;
 	unsigned nr_timeouts;
 	int hit_timeout;
 	ktime_t min_timeout;
-	ktime_t timeout;
 	struct hrtimer t;
 
 #ifdef CONFIG_NET_RX_BUSY_POLL
@@ -59,7 +63,8 @@ struct io_wait_queue {
 static inline bool io_should_wake(struct io_wait_queue *iowq)
 {
 	struct io_ring_ctx *ctx = iowq->ctx;
-	int dist = READ_ONCE(ctx->rings->cq.tail) - (int) iowq->cq_tail;
+	u32 target = iowq->state.target_cq_tail;
+	int dist = READ_ONCE(ctx->rings->cq.tail) - target;
 
 	/*
 	 * Wake up if we have enough events, or if a timeout occurred since we
diff --git a/io_uring/napi.c b/io_uring/napi.c
index 4a10de03e426..e08bddc1dbd2 100644
--- a/io_uring/napi.c
+++ b/io_uring/napi.c
@@ -360,8 +360,8 @@ void __io_napi_busy_loop(struct io_ring_ctx *ctx, struct io_wait_queue *iowq)
 		return;
 
 	iowq->napi_busy_poll_dt = READ_ONCE(ctx->napi_busy_poll_dt);
-	if (iowq->timeout != KTIME_MAX) {
-		ktime_t dt = ktime_sub(iowq->timeout, io_get_time(ctx));
+	if (iowq->state.timeout != KTIME_MAX) {
+		ktime_t dt = ktime_sub(iowq->state.timeout, io_get_time(ctx));
 
 		iowq->napi_busy_poll_dt = min_t(u64, iowq->napi_busy_poll_dt, dt);
 	}
-- 
2.49.0

From nobody Fri Dec 19 19:23:40 2025
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com, Martin KaFai Lau, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC v2 2/5] io_uring/bpf: add stubs for bpf struct_ops
Date: Fri, 6 Jun 2025 14:57:59 +0100
X-Mailer: git-send-email 2.49.0

Add some basic helpers and definitions for implementing bpf struct_ops.
There are no callbacks yet, and registration will always fail.

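As a rough illustration of the context-access rule that the stub verifier
callback in this patch applies before deferring to btf_ctx_access(), here
is a userspace sketch. The names are hypothetical, the limit of 12
arguments is an assumed value for MAX_BPF_FUNC_ARGS, and the size <= 0
guard is an addition of the sketch to avoid a division by zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the kernel's MAX_BPF_FUNC_ARGS; hypothetical name. */
#define SKETCH_MAX_FUNC_ARGS 12

/* Hypothetical stand-in for enum bpf_access_type. */
enum sketch_access_type { SKETCH_READ, SKETCH_WRITE };

/*
 * Accept only reads that start inside the u64 argument slots and whose
 * offset is a multiple of the access size. The kernel callback performs
 * the same three checks and then asks btf_ctx_access() for the final
 * verdict; that last step is not modelled here.
 */
static bool sketch_ctx_access_ok(int off, int size,
				 enum sketch_access_type type)
{
	if (type != SKETCH_READ)
		return false;
	if (off < 0 || off >= (int)(sizeof(uint64_t) * SKETCH_MAX_FUNC_ARGS))
		return false;
	if (size <= 0 || off % size != 0)
		return false;

	return true;
}
```
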
Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |  4 ++
 io_uring/Kconfig               |  5 ++
 io_uring/Makefile              |  1 +
 io_uring/bpf.c                 | 93 ++++++++++++++++++++++++++++++++++
 io_uring/bpf.h                 | 26 ++++++++++
 io_uring/io_uring.c            |  3 ++
 6 files changed, 132 insertions(+)
 create mode 100644 io_uring/bpf.c
 create mode 100644 io_uring/bpf.h

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 2922635986f5..26ee1a6f52e7 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -8,6 +8,8 @@
 #include
 #include
 
+struct io_uring_ops;
+
 enum {
 	/*
 	 * A hint to not wake right away but delay until there are enough of
@@ -344,6 +346,8 @@ struct io_ring_ctx {
 
 	void *cq_wait_arg;
 	size_t cq_wait_size;
+
+	struct io_uring_ops *bpf_ops;
 } ____cacheline_aligned_in_smp;
 
 /*
diff --git a/io_uring/Kconfig b/io_uring/Kconfig
index 4b949c42c0bf..b4dad9b74544 100644
--- a/io_uring/Kconfig
+++ b/io_uring/Kconfig
@@ -9,3 +9,8 @@ config IO_URING_ZCRX
 	depends on PAGE_POOL
 	depends on INET
 	depends on NET_RX_BUSY_POLL
+
+config IO_URING_BPF
+	def_bool y
+	depends on IO_URING
+	depends on BPF_SYSCALL && BPF_JIT && DEBUG_INFO_BTF
diff --git a/io_uring/Makefile b/io_uring/Makefile
index d97c6b51d584..58f46c0f9895 100644
--- a/io_uring/Makefile
+++ b/io_uring/Makefile
@@ -21,3 +21,4 @@ obj-$(CONFIG_EPOLL) += epoll.o
 obj-$(CONFIG_NET_RX_BUSY_POLL) += napi.o
 obj-$(CONFIG_NET) += net.o cmd_net.o
 obj-$(CONFIG_PROC_FS) += fdinfo.o
+obj-$(CONFIG_IO_URING_BPF) += bpf.o
diff --git a/io_uring/bpf.c b/io_uring/bpf.c
new file mode 100644
index 000000000000..3096c54e4fb3
--- /dev/null
+++ b/io_uring/bpf.c
@@ -0,0 +1,93 @@
+#include
+
+#include "bpf.h"
+#include "register.h"
+
+static struct io_uring_ops io_bpf_ops_stubs = {
+};
+
+static bool bpf_io_is_valid_access(int off, int size,
+				   enum bpf_access_type type,
+				   const struct bpf_prog *prog,
+				   struct bpf_insn_access_aux *info)
+{
+	if (type != BPF_READ)
+		return false;
+	if (off < 0 || off >= sizeof(__u64) * MAX_BPF_FUNC_ARGS)
+		return false;
+	if (off % size != 0)
+		return false;
+
+	return btf_ctx_access(off, size, type, prog, info);
+}
+
+static int bpf_io_btf_struct_access(struct bpf_verifier_log *log,
+				    const struct bpf_reg_state *reg, int off,
+				    int size)
+{
+	return -EACCES;
+}
+
+static const struct bpf_verifier_ops bpf_io_verifier_ops = {
+	.get_func_proto = bpf_base_func_proto,
+	.is_valid_access = bpf_io_is_valid_access,
+	.btf_struct_access = bpf_io_btf_struct_access,
+};
+
+static int bpf_io_init(struct btf *btf)
+{
+	return 0;
+}
+
+static int bpf_io_check_member(const struct btf_type *t,
+			       const struct btf_member *member,
+			       const struct bpf_prog *prog)
+{
+	return 0;
+}
+
+static int bpf_io_init_member(const struct btf_type *t,
+			      const struct btf_member *member,
+			      void *kdata, const void *udata)
+{
+	return 0;
+}
+
+static int bpf_io_reg(void *kdata, struct bpf_link *link)
+{
+	return -EOPNOTSUPP;
+}
+
+static void bpf_io_unreg(void *kdata, struct bpf_link *link)
+{
+}
+
+void io_unregister_bpf_ops(struct io_ring_ctx *ctx)
+{
+}
+
+static struct bpf_struct_ops bpf_io_uring_ops = {
+	.verifier_ops = &bpf_io_verifier_ops,
+	.reg = bpf_io_reg,
+	.unreg = bpf_io_unreg,
+	.check_member = bpf_io_check_member,
+	.init_member = bpf_io_init_member,
+	.init = bpf_io_init,
+	.cfi_stubs = &io_bpf_ops_stubs,
+	.name = "io_uring_ops",
+	.owner = THIS_MODULE,
+};
+
+static int __init io_uring_bpf_init(void)
+{
+	int ret;
+
+	ret = register_bpf_struct_ops(&bpf_io_uring_ops, io_uring_ops);
+	if (ret) {
+		pr_err("io_uring: Failed to register struct_ops (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+__initcall(io_uring_bpf_init);
diff --git a/io_uring/bpf.h b/io_uring/bpf.h
new file mode 100644
index 000000000000..a61c489d306b
--- /dev/null
+++ b/io_uring/bpf.h
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef IOU_BPF_H
+#define IOU_BPF_H
+
+#include
+#include
+
+#include "io_uring.h"
+
+struct io_uring_ops {
+};
+
+static inline bool io_bpf_attached(struct io_ring_ctx *ctx)
+{
+	return IS_ENABLED(CONFIG_BPF) && ctx->bpf_ops != NULL;
+}
+
+#ifdef CONFIG_BPF
+void io_unregister_bpf_ops(struct io_ring_ctx *ctx);
+#else
+static inline void io_unregister_bpf_ops(struct io_ring_ctx *ctx)
+{
+}
+#endif
+
+#endif
\ No newline at end of file
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9cc4d8f335a1..8f68e898d60c 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -98,6 +98,7 @@
 #include "msg_ring.h"
 #include "memmap.h"
 #include "zcrx.h"
+#include "bpf.h"
 
 #include "timeout.h"
 #include "poll.h"
@@ -2870,6 +2871,8 @@ static __cold void io_ring_exit_work(struct work_struct *work)
 	struct io_tctx_node *node;
 	int ret;
 
+	io_unregister_bpf_ops(ctx);
+
 	/*
 	 * If we're doing polled IO and end up having requests being
 	 * submitted async (out-of-line), then completions can come in while
-- 
2.49.0

From nobody Fri Dec 19 19:23:40 2025
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com, Martin KaFai Lau, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC v2 3/5] io_uring/bpf: implement struct_ops registration
Date: Fri, 6 Jun 2025 14:58:00 +0100
X-Mailer: git-send-email 2.49.0

Add ring_fd to the struct_ops and implement [un]registration.

Signed-off-by: Pavel Begunkov
---
 io_uring/bpf.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++-
 io_uring/bpf.h |  3 +++
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/io_uring/bpf.c b/io_uring/bpf.c
index 3096c54e4fb3..0f82acf09959 100644
--- a/io_uring/bpf.c
+++ b/io_uring/bpf.c
@@ -3,6 +3,8 @@
 #include "bpf.h"
 #include "register.h"
 
+DEFINE_MUTEX(io_bpf_ctrl_mutex);
+
 static struct io_uring_ops io_bpf_ops_stubs = {
 };
 
@@ -50,20 +52,83 @@ static int bpf_io_init_member(const struct btf_type *t,
 			      const struct btf_member *member,
 			      void *kdata, const void *udata)
 {
+	u32 moff = __btf_member_bit_offset(t, member) / 8;
+	const struct io_uring_ops *uops = udata;
+	struct io_uring_ops *ops = kdata;
+
+	switch (moff) {
+	case offsetof(struct io_uring_ops, ring_fd):
+		ops->ring_fd = uops->ring_fd;
+		return 1;
+	}
+	return 0;
+}
+
+static int io_register_bpf_ops(struct io_ring_ctx *ctx, struct io_uring_ops *ops)
+{
+	if (ctx->bpf_ops)
+		return -EBUSY;
+	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
+		return -EOPNOTSUPP;
+
+	percpu_ref_get(&ctx->refs);
+	ops->ctx = ctx;
+	ctx->bpf_ops = ops;
 	return 0;
 }
 
 static int bpf_io_reg(void *kdata, struct bpf_link *link)
 {
-	return -EOPNOTSUPP;
+	struct io_uring_ops *ops = kdata;
+	struct io_ring_ctx *ctx;
+	struct file *file;
+	int ret;
+
+	file = io_uring_register_get_file(ops->ring_fd, false);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	ctx = file->private_data;
+	scoped_guard(mutex, &ctx->uring_lock)
+		ret = io_register_bpf_ops(ctx, ops);
+
+	fput(file);
+	return ret;
 }
 
 static void bpf_io_unreg(void *kdata, struct bpf_link *link)
 {
+	struct io_uring_ops *ops = kdata;
+	struct io_ring_ctx *ctx;
+
+	guard(mutex)(&io_bpf_ctrl_mutex);
+
+	ctx = ops->ctx;
+	ops->ctx = NULL;
+
+	if (ctx) {
+		scoped_guard(mutex, &ctx->uring_lock) {
+			if (ctx->bpf_ops == ops)
+				ctx->bpf_ops = NULL;
+		}
+		percpu_ref_put(&ctx->refs);
+	}
 }
 
 void io_unregister_bpf_ops(struct io_ring_ctx *ctx)
 {
+	struct io_uring_ops *ops;
+
+	guard(mutex)(&io_bpf_ctrl_mutex);
+	guard(mutex)(&ctx->uring_lock);
+
+	ops = ctx->bpf_ops;
+	ctx->bpf_ops = NULL;
+
+	if (ops && ops->ctx) {
+		percpu_ref_put(&ctx->refs);
+		ops->ctx = NULL;
+	}
 }
 
 static struct bpf_struct_ops bpf_io_uring_ops = {
diff --git a/io_uring/bpf.h b/io_uring/bpf.h
index a61c489d306b..4b147540d006 100644
--- a/io_uring/bpf.h
+++ b/io_uring/bpf.h
@@ -8,6 +8,9 @@
 #include "io_uring.h"
 
 struct io_uring_ops {
+	__u32 ring_fd;
+
+	struct io_ring_ctx *ctx;
 };
 
 static inline bool io_bpf_attached(struct io_ring_ctx *ctx)
-- 
2.49.0

From nobody Fri Dec 19 19:23:40 2025
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: asml.silence@gmail.com, Martin KaFai Lau, bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC v2 4/5] io_uring/bpf: add handle events callback
Date: Fri, 6 Jun 2025 14:58:01 +0100
Message-ID: <1c8fcadfb605269011618e285a4d9e066542dba2.1749214572.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.49.0

Add a struct_ops callback called handle_events, which is invoked from
the CQ waiting loop every time there is an event that might be
interesting to the program.
The program takes the io_uring ctx and also a loop state, which it can use to set the number of events it wants to wait for as well as the timeout value. Signed-off-by: Pavel Begunkov --- io_uring/bpf.c | 33 +++++++++++++++++++++++++++++++++ io_uring/bpf.h | 16 ++++++++++++++++ io_uring/io_uring.c | 22 +++++++++++++++++++++- 3 files changed, 70 insertions(+), 1 deletion(-) diff --git a/io_uring/bpf.c b/io_uring/bpf.c index 0f82acf09959..f86b12f280e8 100644 --- a/io_uring/bpf.c +++ b/io_uring/bpf.c @@ -1,11 +1,20 @@ #include +#include =20 #include "bpf.h" #include "register.h" =20 +static const struct btf_type *loop_state_type; DEFINE_MUTEX(io_bpf_ctrl_mutex); =20 +static int io_bpf_ops__handle_events(struct io_ring_ctx *ctx, + struct iou_loop_state *state) +{ + return IOU_EVENTS_STOP; +} + static struct io_uring_ops io_bpf_ops_stubs =3D { + .handle_events =3D io_bpf_ops__handle_events, }; =20 static bool bpf_io_is_valid_access(int off, int size, @@ -27,6 +36,16 @@ static int bpf_io_btf_struct_access(struct bpf_verifier_= log *log, const struct bpf_reg_state *reg, int off, int size) { + const struct btf_type *t =3D btf_type_by_id(reg->btf, reg->btf_id); + + if (t =3D=3D loop_state_type) { + if (off >=3D offsetof(struct iou_loop_state, target_cq_tail) && + off + size <=3D offsetofend(struct iou_loop_state, target_cq_tail)) + return SCALAR_VALUE; + if (off >=3D offsetof(struct iou_loop_state, timeout) && + off + size <=3D offsetofend(struct iou_loop_state, timeout)) + return SCALAR_VALUE; + } return -EACCES; } =20 @@ -36,8 +55,22 @@ static const struct bpf_verifier_ops bpf_io_verifier_ops= =3D { .btf_struct_access =3D bpf_io_btf_struct_access, }; =20 +static const struct btf_type * +io_lookup_struct_type(struct btf *btf, const char *name) +{ + s32 type_id; + + type_id =3D btf_find_by_name_kind(btf, name, BTF_KIND_STRUCT); + if (type_id < 0) + return NULL; + return btf_type_by_id(btf, type_id); +} + static int bpf_io_init(struct btf *btf) { + loop_state_type =3D 
io_lookup_struct_type(btf, "iou_loop_state"); + if (!loop_state_type) + return -EINVAL; return 0; } =20 diff --git a/io_uring/bpf.h b/io_uring/bpf.h index 4b147540d006..ac4a9361f9c7 100644 --- a/io_uring/bpf.h +++ b/io_uring/bpf.h @@ -7,12 +7,28 @@ =20 #include "io_uring.h" =20 +enum { + IOU_EVENTS_WAIT, + IOU_EVENTS_STOP, +}; + struct io_uring_ops { __u32 ring_fd; =20 + int (*handle_events)(struct io_ring_ctx *ctx, struct iou_loop_state *stat= e); + struct io_ring_ctx *ctx; }; =20 +static inline int io_run_bpf(struct io_ring_ctx *ctx, struct iou_loop_stat= e *state) +{ + scoped_guard(mutex, &ctx->uring_lock) { + if (!ctx->bpf_ops) + return IOU_EVENTS_STOP; + return ctx->bpf_ops->handle_events(ctx, state); + } +} + static inline bool io_bpf_attached(struct io_ring_ctx *ctx) { return IS_ENABLED(CONFIG_BPF) && ctx->bpf_ops !=3D NULL; diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 8f68e898d60c..bf245be0844b 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -2540,8 +2540,13 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, i= nt min_events, u32 flags, =20 if (unlikely(test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq))) io_cqring_do_overflow_flush(ctx); - if (__io_cqring_events_user(ctx) >=3D min_events) + + if (io_bpf_attached(ctx)) { + if (ext_arg->min_time) + return -EINVAL; + } else if (__io_cqring_events_user(ctx) >=3D min_events) { return 0; + } =20 init_waitqueue_func_entry(&iowq.wq, io_wake_function); iowq.wq.private =3D current; @@ -2621,6 +2626,21 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, i= nt min_events, u32 flags, if (ret < 0) break; =20 + if (io_bpf_attached(ctx)) { + ret =3D io_run_bpf(ctx, &iowq.state); + if (ret !=3D IOU_EVENTS_WAIT) + break; + + if (unlikely(read_thread_flags())) { + if (task_sigpending(current)) { + ret =3D -EINTR; + break; + } + cond_resched(); + } + continue; + } + check_cq =3D READ_ONCE(ctx->check_cq); if (unlikely(check_cq)) { /* let the caller flush overflows, retry */ --=20 2.49.0 From 
nobody Fri Dec 19 19:23:40 2025 Received: from mail-ej1-f42.google.com (mail-ej1-f42.google.com [209.85.218.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BAAAE289E1B; Fri, 6 Jun 2025 13:56:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.218.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749218212; cv=none; b=X9ij/VJ43QFH85MJa42KqbBZyRcFIKnEen4Q7FxSHr+0wZ/42TzIC4fP+H/iGZUoGh5MYmgEntfSiPDERwTOZ2aeQnvAxP3kTZnZO5AEwhjmBC121zuPesggDuXTdVidEtqIBA1IiDXqjlvhqATfKT2xR7f/VAKf6o13a7/4V1g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749218212; c=relaxed/simple; bh=6XuUXclfgoEKljVaIZCM2rGLK7pYyEG8d9ML+mCyG1I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=c8sWQeZybx4SgeC5ivSoCFmT0aMIP5mPJ8lJP4NtOuVk5mDkHm3A2D0KOrYhDUzCFV+kPIKWtBrlY/LIB+xzFRJwVZ6Jp9KcuHaEDyR7YHe0RpECUbuyiUSTQgywD3oOY3CvtNYiytiOskKzt51rWB9K0ZgCyftyGl2ibuOVmAc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=DnkOUMVy; arc=none smtp.client-ip=209.85.218.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="DnkOUMVy" Received: by mail-ej1-f42.google.com with SMTP id a640c23a62f3a-adb2bd27c7bso342193466b.2; Fri, 06 Jun 2025 06:56:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749218209; x=1749823009; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fvUd63Qr4Uj7TI50ZVFoqkRT4E5BfeemVFyk/+/NcyI=; b=DnkOUMVyYh1qZ84LvYOqpK1J2zChMvMwTvckKYCvF1xYzdk+2FM8KhwbRpmlJ0qWvl Vkj62lwzFc0TLLxgBgc/XTNmwqPPDgLobrQUbX9MqwgLSQeSNN7Sif9VhakHUbxg/lAw iGU43ok02uwgcrehgtbzAUdq6A5u4CdBfYWZGDozPCYG8KfKHUmd+YXQYv3qusnBp8Fj OMlBV6KViefJ4ksoW1xjC41Zq31MRLrNMbRgh35SZVjw1dHac8ixa/ChpyAPVA2XyQpy m7XJtbbxHIk3Hzx2Yz2Flw4WHnjWjJjPDPKAq+InfTR+/FZuiiCX6fNFl0xgyJStSup4 Yf+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749218209; x=1749823009; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fvUd63Qr4Uj7TI50ZVFoqkRT4E5BfeemVFyk/+/NcyI=; b=s3KQwRzx5/HLvFYNva5IZQN4dy27quNZojJWY/6XsnpmlhAOJ7UzHlmbI7VVnA1/CV u5euLbPM3qdkMOVPzuRUn2sChmLvJESbPmJEA01eGOmV1yV9qBPqR+THcBMHbDWtc/rS az4c5u3Nk7vqNO38xOOwEM3tl9adC0Luo1iBhU15nC6iCSpKSdppWFNU+Sad9LL8MY1x VupuCgF89K7vdyoTgsRShlmStBqfsUXz83wNNG0guNGYHRqyle7ihXNKK8Y131L5y4N/ vleD0czanNdVRHXFG4WtXDAe8sLzDr6SrzPF+IKWT3r0hIvJpGL6m4LqNCN+tL0qmLEW LBfA== X-Forwarded-Encrypted: i=1; AJvYcCWr4eHgZFaUGz0CBfI3AQrc7qpbbw2Z64KOJCMI7TbeOZ6X0KSRMyAtcAJv9veUiHxTYe0=@vger.kernel.org, AJvYcCXDX++ZsAA7+Pss/HOnDRSFBjtcHbcwu9fBk/BBfo4T49cH0dLfk2YjcwvIJUhyUd27aKReOVEb7LMzmGwp@vger.kernel.org X-Gm-Message-State: AOJu0Yw4ZMCFQ58tiiu9inXW0GsNrzB//gggiClnhDHvtxjfMNhq295a 1Sj5wn8ixGQL4Su6cDvbFvJQnxD731ZN4InGr37yF6/ttFjOkZkwghEXpdl65w== X-Gm-Gg: ASbGnctKo1l2EziC4vshnllekvOs1TCAgyrUWxWyLTAUhyVlBXa/xx2pRxLlZY1NspR M3tVB3iWKYCQjWW64Ay1/VBuS2ASxo+hI6dX7nNB1N65a7Y5y/a/F8kCXVnG2xMCjNL98vbbsTj laWpZ9EMn+W0ff7dscEWuVM+rcxAWTqGo8DjxjfA5XG0csvds/zQFOLvcSl4QBnVS9X9omz+Yie 2ZDqpR+fHiHQhP/7RBSpkjTB1x6vHtwu+Dn9Le+iDwSNS3E5pI7Qvn/0Y4769+n9dUWqdt+2fMv kT0oxZlb+zIcLaPjDlNftwPW62ZhqJ5UY8XxuvwOCSVjyw== X-Google-Smtp-Source: AGHT+IEsx3hUnaelYT0aEIVsQCZHO0xm/AdKQqvAEwfCIS6P16J6XswUGs8Rz+chBLEDz7p9hGnd/g== 
X-Received: by 2002:a17:907:5ce:b0:adb:469d:223b with SMTP id a640c23a62f3a-ade1aa469bbmr277091866b.49.1749218208386; Fri, 06 Jun 2025 06:56:48 -0700 (PDT) Received: from 127.com ([2620:10d:c092:600::1:a199]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-ade1dc379f6sm118026766b.110.2025.06.06.06.56.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 06 Jun 2025 06:56:47 -0700 (PDT) From: Pavel Begunkov To: io-uring@vger.kernel.org Cc: asml.silence@gmail.com, Martin KaFai Lau , bpf@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFC v2 5/5] io_uring/bpf: add basic kfunc helpers Date: Fri, 6 Jun 2025 14:58:02 +0100 Message-ID: X-Mailer: git-send-email 2.49.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" A handle_events program should be able to parse the CQ and submit new requests, so add kfuncs to cover that. The only essential kfunc here is bpf_io_uring_submit_sqes, and the rest are likely to be removed in a non-RFC version in favour of a more general approach. Signed-off-by: Pavel Begunkov --- io_uring/bpf.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 86 insertions(+) diff --git a/io_uring/bpf.c b/io_uring/bpf.c index f86b12f280e8..9494e4289605 100644 --- a/io_uring/bpf.c +++ b/io_uring/bpf.c @@ -1,12 +1,92 @@ #include #include =20 +#include "io_uring.h" #include "bpf.h" #include "register.h" =20 static const struct btf_type *loop_state_type; DEFINE_MUTEX(io_bpf_ctrl_mutex); =20 +__bpf_kfunc_start_defs(); + +__bpf_kfunc int bpf_io_uring_submit_sqes(struct io_ring_ctx *ctx, + unsigned nr) +{ + return io_submit_sqes(ctx, nr); +} + +__bpf_kfunc int bpf_io_uring_post_cqe(struct io_ring_ctx *ctx, + u64 data, u32 res, u32 cflags) +{ + bool posted; + + posted =3D io_post_aux_cqe(ctx, data, res, cflags); + return posted ?
0 : -ENOMEM; +} + +__bpf_kfunc int bpf_io_uring_queue_sqe(struct io_ring_ctx *ctx, + void *bpf_sqe, int mem__sz) +{ + unsigned tail =3D ctx->rings->sq.tail; + struct io_uring_sqe *sqe; + + if (mem__sz !=3D sizeof(*sqe)) + return -EINVAL; + + ctx->rings->sq.tail++; + tail &=3D (ctx->sq_entries - 1); + /* double index for 128-byte SQEs, twice as long */ + if (ctx->flags & IORING_SETUP_SQE128) + tail <<=3D 1; + sqe =3D &ctx->sq_sqes[tail]; + memcpy(sqe, bpf_sqe, sizeof(*sqe)); + return 0; +} + +__bpf_kfunc +struct io_uring_cqe *bpf_io_uring_get_cqe(struct io_ring_ctx *ctx, u32 idx) +{ + unsigned max_entries =3D ctx->cq_entries; + struct io_uring_cqe *cqe_array =3D ctx->rings->cqes; + + if (ctx->flags & IORING_SETUP_CQE32) + max_entries *=3D 2; + return &cqe_array[idx & (max_entries - 1)]; +} + +__bpf_kfunc +struct io_uring_cqe *bpf_io_uring_extract_next_cqe(struct io_ring_ctx *ctx) +{ + struct io_rings *rings =3D ctx->rings; + unsigned int mask =3D ctx->cq_entries - 1; + unsigned head =3D rings->cq.head; + struct io_uring_cqe *cqe; + + /* TODO CQE32 */ + if (head =3D=3D rings->cq.tail) + return NULL; + + cqe =3D &rings->cqes[head & mask]; + rings->cq.head++; + return cqe; +} + +__bpf_kfunc_end_defs(); + +BTF_KFUNCS_START(io_uring_kfunc_set) +BTF_ID_FLAGS(func, bpf_io_uring_submit_sqes, KF_SLEEPABLE); +BTF_ID_FLAGS(func, bpf_io_uring_post_cqe, KF_SLEEPABLE); +BTF_ID_FLAGS(func, bpf_io_uring_queue_sqe, KF_SLEEPABLE); +BTF_ID_FLAGS(func, bpf_io_uring_get_cqe, 0); +BTF_ID_FLAGS(func, bpf_io_uring_extract_next_cqe, KF_RET_NULL); +BTF_KFUNCS_END(io_uring_kfunc_set) + +static const struct btf_kfunc_id_set bpf_io_uring_kfunc_set =3D { + .owner =3D THIS_MODULE, + .set =3D &io_uring_kfunc_set, +}; + static int io_bpf_ops__handle_events(struct io_ring_ctx *ctx, struct iou_loop_state *state) { @@ -186,6 +266,12 @@ static int __init io_uring_bpf_init(void) return ret; } =20 + ret =3D register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, + &bpf_io_uring_kfunc_set); + if (ret) { + 
pr_err("io_uring: Failed to register kfuncs (%d)\n", ret); + return ret; + } return 0; } __initcall(io_uring_bpf_init); --=20 2.49.0
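Illustration only, not part of the series: the head/tail arithmetic used by bpf_io_uring_extract_next_cqe above can be tried out in userspace with a small model. This is a rough sketch under stated assumptions. The names model_cqe, model_ring, model_post_cqe and model_extract_next_cqe are hypothetical stand-ins and do not exist in the kernel; the model assumes 16-byte CQEs only (no CQE32, matching the TODO in the patch) and omits the memory barriers a ring shared with userspace would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are not the kernel's structures. */
struct model_cqe {
	uint64_t user_data;
	int32_t res;
	uint32_t flags;
};

struct model_ring {
	uint32_t head;          /* consumer index, free-running */
	uint32_t tail;          /* producer index, free-running */
	uint32_t entries;       /* ring size, must be a power of two */
	struct model_cqe *cqes; /* array of 'entries' slots */
};

/* Producer side: append one CQE. Returns 0 on success, -1 if the ring is full. */
static int model_post_cqe(struct model_ring *r, uint64_t data, int32_t res)
{
	struct model_cqe *cqe;

	/* Masking with (entries - 1) only works for power-of-two sizes. */
	assert(r->entries && (r->entries & (r->entries - 1)) == 0);

	/* Unsigned subtraction stays correct when the indices wrap. */
	if (r->tail - r->head == r->entries)
		return -1;

	cqe = &r->cqes[r->tail & (r->entries - 1)];
	cqe->user_data = data;
	cqe->res = res;
	cqe->flags = 0;
	r->tail++;
	return 0;
}

/*
 * Consumer side: same shape as the kfunc in the patch. An empty ring is
 * head == tail; otherwise the slot is head & mask and head is advanced.
 */
static struct model_cqe *model_extract_next_cqe(struct model_ring *r)
{
	uint32_t mask = r->entries - 1;
	struct model_cqe *cqe;

	if (r->head == r->tail)
		return NULL;

	cqe = &r->cqes[r->head & mask];
	r->head++;
	return cqe;
}
```

Starting head and tail just below UINT32_MAX is a cheap way to check that the indices may wrap freely while the masked slot index stays inside the array.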