[PATCH 0/6] return an error when cqe is dropped

Dylan Yudaken posted 6 patches 4 years ago
Posted by Dylan Yudaken 4 years ago
This series addresses a rare but real error condition when a CQE is
dropped. Many applications rely on 1 SQE resulting in 1 CQE and may even
block waiting for the CQE. In overflow conditions if the GFP_ATOMIC
allocation fails, the CQE is dropped and a counter is incremented. However
the application is not actively signalled that something bad has
happened. We would like to indicate this error condition to the
application but in a way that does not rely on the application doing
invasive changes such as checking a flag before each wait.

This series returns an error code to the application when the error hits,
and then resets the error condition. If the application is ok with this
error it can continue as is, or more likely it can clean up sanely.
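
As a rough illustration of the report-once-then-reset behaviour described
above, here is a small user-space model. It is only a sketch: the struct,
the field names and the model_* helpers are hypothetical and do not match
fs/io_uring.c, and the -EBADR value is an assumption taken from what
io_uring_enter(2) documents for a dropped CQE (this cover letter does not
name the error code).

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EBADR
#define EBADR 53 /* Linux value; fallback for platforms without it */
#endif

/* Hypothetical model of the CQ state; not the kernel's real layout. */
struct model_ring {
	unsigned int cq_free;      /* free slots left in the CQ ring */
	unsigned int overflow_len; /* CQEs parked on the overflow list */
	unsigned int dropped;      /* count of CQEs lost for good */
	bool drop_pending;         /* a drop happened and is not yet reported */
};

/*
 * Post one CQE. alloc_ok stands in for the result of the GFP_ATOMIC
 * allocation that backs an overflow entry.
 */
static void model_post_cqe(struct model_ring *r, bool alloc_ok)
{
	if (r->cq_free > 0) {
		r->cq_free--;           /* normal case: CQE fits in the ring */
		return;
	}
	if (alloc_ok) {
		r->overflow_len++;      /* ring full: keep CQE on overflow list */
		return;
	}
	r->dropped++;               /* allocation failed: CQE is lost */
	r->drop_pending = true;     /* remember to tell the application */
}

/*
 * Model of the wait path: report a pending drop exactly once, then
 * reset the condition so later calls proceed normally.
 */
static int model_enter(struct model_ring *r)
{
	if (r->drop_pending) {
		r->drop_pending = false;
		return -EBADR;          /* assumed error code, see note above */
	}
	return 0;
}
```

In this model, a ring with one free slot that receives three completions
(the last with a failed allocation) ends up with one dropped CQE; the next
model_enter() call returns -EBADR and the call after that returns 0, so an
application that only checks the return value of its wait call sees the
problem without polling a flag.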

Patches 1&2 add tracing for overflows
Patches 3&4 prep for adding this error
Patch 5 is the main one returning an error
Patch 6 allows liburing to test these conditions more easily with IOPOLL

Dylan Yudaken (6):
  io_uring: add trace support for CQE overflow
  io_uring: trace cqe overflows
  io_uring: rework io_uring_enter to simplify return value
  io_uring: use constants for cq_overflow bitfield
  io_uring: return an error when cqe is dropped
  io_uring: allow NOP opcode in IOPOLL mode

 fs/io_uring.c                   | 89 ++++++++++++++++++++++-----------
 include/trace/events/io_uring.h | 42 +++++++++++++++-
 2 files changed, 102 insertions(+), 29 deletions(-)


base-commit: 7c648b7d6186c59ed3a0e0ae4b774aaf4b415ef2
-- 
2.30.2
Re: [PATCH 0/6] return an error when cqe is dropped
Posted by Jens Axboe 4 years ago
On Thu, 21 Apr 2022 02:13:39 -0700, Dylan Yudaken wrote:
> This series addresses a rare but real error condition when a CQE is
> dropped. Many applications rely on 1 SQE resulting in 1 CQE and may even
> block waiting for the CQE. In overflow conditions if the GFP_ATOMIC
> allocation fails, the CQE is dropped and a counter is incremented. However
> the application is not actively signalled that something bad has
> happened. We would like to indicate this error condition to the
> application but in a way that does not rely on the application doing
> invasive changes such as checking a flag before each wait.
> 
> [...]

Applied, thanks!

[1/6] io_uring: add trace support for CQE overflow
      commit: f457ab8deb017140aef05be3027a00a18a7d16b7
[2/6] io_uring: trace cqe overflows
      commit: 2a847e6faf76810ae68a6e81bd9ac3a7c81534d0
[3/6] io_uring: rework io_uring_enter to simplify return value
      commit: db9bb58b391c9e62da68bc139598e8470d892c77
[4/6] io_uring: use constants for cq_overflow bitfield
      commit: b293240e2634b2100196d7314aeeb84299ce6d5b
[5/6] io_uring: return an error when cqe is dropped
      commit: 34a7ee8a42c8496632465f3f0b444b3a7b908c46
[6/6] io_uring: allow NOP opcode in IOPOLL mode
      commit: ebbe59f49556822b9bcc7b0d4d96bae31f522905

Best regards,
-- 
Jens Axboe