[PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism

Jiahui Cen posted 9 patches 3 years, 3 months ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20210205101315.13042-1-cenjiahui@huawei.com
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Fam Zheng <fam@euphon.net>, Kevin Wolf <kwolf@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, John Snow <jsnow@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>, Max Reitz <mreitz@redhat.com>
block/block-backend.c          | 68 ++++++++++++++++++++
blockdev.c                     | 52 +++++++++++++++
hw/block/block.c               | 10 +++
hw/block/virtio-blk.c          | 21 +++++-
hw/scsi/scsi-bus.c             | 16 +++--
hw/scsi/scsi-disk.c            | 16 +++++
include/hw/block/block.h       |  7 +-
include/hw/scsi/scsi.h         |  1 +
include/sysemu/block-backend.h | 10 +++
qapi/block-core.json           |  9 ++-
10 files changed, 199 insertions(+), 11 deletions(-)
[PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Jiahui Cen 3 years, 3 months ago
A VM in the cloud environment may use a virutal disk as the backend storage,
and there are usually filesystems on the virtual block device. When backend
storage is temporarily down, any I/O issued to the virtual block device
will cause an error. For example, an error occurred in ext4 filesystem would
make the filesystem readonly. In production environment, a cloud backend
storage can be soon recovered. For example, an IP-SAN may be down due to
network failure and will be online soon after network is recovered. However,
the error in the filesystem may not be recovered unless a device reattach
or system restart. Thus an I/O retry mechanism is in need to implement a
self-healing system.

This patch series propose to extend the werror=/rerror= mechanism to add
a 'retry' feature. It can automatically retry failed I/O requests on error
without sending error back to guest, and guest can get back running smoothly
when I/O is recovred.

v4->v5:
* Add document for 'retry' in qapi.
* Support werror=/rerror=retry for scsi-disk.
* Pause retry when draining.

v3->v4:
* Adapt to werror=/rerror= mechanism.

v2->v3:
* Add a doc to describe I/O hang.

v1->v2:
* Rebase to fix compile problems.
* Fix incorrect remove of rehandle list.
* Provide rehandle pause interface.

REF: https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06560.html

Jiahui Cen (9):
  qapi/block-core: Add retry option for error action
  block-backend: Introduce retry timer
  block-backend: Add device specific retry callback
  block-backend: Enable retry action on errors
  block-backend: Add timeout support for retry
  block: Add error retry param setting
  virtio_blk: Add support for retry on errors
  scsi-bus: Refactor the code that retries requests
  scsi-disk: Add support for retry on errors

 block/block-backend.c          | 68 ++++++++++++++++++++
 blockdev.c                     | 52 +++++++++++++++
 hw/block/block.c               | 10 +++
 hw/block/virtio-blk.c          | 21 +++++-
 hw/scsi/scsi-bus.c             | 16 +++--
 hw/scsi/scsi-disk.c            | 16 +++++
 include/hw/block/block.h       |  7 +-
 include/hw/scsi/scsi.h         |  1 +
 include/sysemu/block-backend.h | 10 +++
 qapi/block-core.json           |  9 ++-
 10 files changed, 199 insertions(+), 11 deletions(-)

-- 
2.29.2


Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Jiahui Cen 3 years, 3 months ago
Kindly ping.
Any comments and reviews are wellcome :)

Thanks,
Jiahui

On 2021/2/5 18:13, Jiahui Cen wrote:
> A VM in the cloud environment may use a virutal disk as the backend storage,
> and there are usually filesystems on the virtual block device. When backend
> storage is temporarily down, any I/O issued to the virtual block device
> will cause an error. For example, an error occurred in ext4 filesystem would
> make the filesystem readonly. In production environment, a cloud backend
> storage can be soon recovered. For example, an IP-SAN may be down due to
> network failure and will be online soon after network is recovered. However,
> the error in the filesystem may not be recovered unless a device reattach
> or system restart. Thus an I/O retry mechanism is in need to implement a
> self-healing system.
> 
> This patch series propose to extend the werror=/rerror= mechanism to add
> a 'retry' feature. It can automatically retry failed I/O requests on error
> without sending error back to guest, and guest can get back running smoothly
> when I/O is recovred.
> 
> v4->v5:
> * Add document for 'retry' in qapi.
> * Support werror=/rerror=retry for scsi-disk.
> * Pause retry when draining.
> 
> v3->v4:
> * Adapt to werror=/rerror= mechanism.
> 
> v2->v3:
> * Add a doc to describe I/O hang.
> 
> v1->v2:
> * Rebase to fix compile problems.
> * Fix incorrect remove of rehandle list.
> * Provide rehandle pause interface.
> 
> REF: https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06560.html
> 
> Jiahui Cen (9):
>   qapi/block-core: Add retry option for error action
>   block-backend: Introduce retry timer
>   block-backend: Add device specific retry callback
>   block-backend: Enable retry action on errors
>   block-backend: Add timeout support for retry
>   block: Add error retry param setting
>   virtio_blk: Add support for retry on errors
>   scsi-bus: Refactor the code that retries requests
>   scsi-disk: Add support for retry on errors
> 
>  block/block-backend.c          | 68 ++++++++++++++++++++
>  blockdev.c                     | 52 +++++++++++++++
>  hw/block/block.c               | 10 +++
>  hw/block/virtio-blk.c          | 21 +++++-
>  hw/scsi/scsi-bus.c             | 16 +++--
>  hw/scsi/scsi-disk.c            | 16 +++++
>  include/hw/block/block.h       |  7 +-
>  include/hw/scsi/scsi.h         |  1 +
>  include/sysemu/block-backend.h | 10 +++
>  qapi/block-core.json           |  9 ++-
>  10 files changed, 199 insertions(+), 11 deletions(-)
> 

Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Stefan Hajnoczi 3 years, 2 months ago
On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote:
> This patch series propose to extend the werror=/rerror= mechanism to add
> a 'retry' feature. It can automatically retry failed I/O requests on error
> without sending error back to guest, and guest can get back running smoothly
> when I/O is recovred.

This patch series implements a retry followed by werror/rerror=report
after a timeout. This mechanism could be made more generic (and the code
could be simplified) by removing the new werror/rerror=retry action and
instead implementing the retry/timeout followed by *any* werror=/rerror=
policy chosen by the user.

In other words, if the retry interval is non-zero, retry the request and
check for timeouts. When the timeout is reached, obey the
werror=/rerror= action.

This is more flexible than hard-coding werror=retry to mean retry
timeout followed by werror=report.

For example:

  werror=stop,write-retry-interval=1000,write-retry-timeout=15000,
  rerror=report,read-retry-interval=1000,read-retry-timeout=15000

Failed write requests will be retried once a second for 15 seconds.
If the timeout is reached the guest is stopped.

Failed read requests will be retried once a second for 15 seconds. If
the timeout is reached the error is reported to the guest.

Stefan
Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Jiahui Cen 3 years, 2 months ago
Hi Stefan,

On 2021/2/23 17:40, Stefan Hajnoczi wrote:
> On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote:
>> This patch series propose to extend the werror=/rerror= mechanism to add
>> a 'retry' feature. It can automatically retry failed I/O requests on error
>> without sending error back to guest, and guest can get back running smoothly
>> when I/O is recovred.
> 
> This patch series implements a retry followed by werror/rerror=report
> after a timeout. This mechanism could be made more generic (and the code
> could be simplified) by removing the new werror/rerror=retry action and
> instead implementing the retry/timeout followed by *any* werror=/rerror=
> policy chosen by the user.
> 
> In other words, if the retry interval is non-zero, retry the request and
> check for timeouts. When the timeout is reached, obey the
> werror=/rerror= action.
> 
> This is more flexible than hard-coding werror=retry to mean retry
> timeout followed by werror=report.
> 
> For example:
> 
>   werror=stop,write-retry-interval=1000,write-retry-timeout=15000,
>   rerror=report,read-retry-interval=1000,read-retry-timeout=15000
> 
> Failed write requests will be retried once a second for 15 seconds.
> If the timeout is reached the guest is stopped.
> 
> Failed read requests will be retried once a second for 15 seconds. If
> the timeout is reached the error is reported to the guest.

Sounds like a better way for me. I'll refactor this patch series under
your suggestion.

Also thanks for your review.

Thanks,
Jiahui

Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Eric Blake 3 years, 2 months ago
On 2/23/21 3:40 AM, Stefan Hajnoczi wrote:
> On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote:
>> This patch series propose to extend the werror=/rerror= mechanism to add
>> a 'retry' feature. It can automatically retry failed I/O requests on error
>> without sending error back to guest, and guest can get back running smoothly
>> when I/O is recovred.
> 
> This patch series implements a retry followed by werror/rerror=report
> after a timeout. This mechanism could be made more generic (and the code
> could be simplified) by removing the new werror/rerror=retry action and
> instead implementing the retry/timeout followed by *any* werror=/rerror=
> policy chosen by the user.
> 
> In other words, if the retry interval is non-zero, retry the request and
> check for timeouts. When the timeout is reached, obey the
> werror=/rerror= action.
> 
> This is more flexible than hard-coding werror=retry to mean retry
> timeout followed by werror=report.
> 
> For example:
> 
>   werror=stop,write-retry-interval=1000,write-retry-timeout=15000,
>   rerror=report,read-retry-interval=1000,read-retry-timeout=15000
> 
> Failed write requests will be retried once a second for 15 seconds.
> If the timeout is reached the guest is stopped.
> 
> Failed read requests will be retried once a second for 15 seconds. If
> the timeout is reached the error is reported to the guest.

You may also want to look at what the NBD block device already
implements for retries, and see if making retry generic to the block
layer in general can do everything already possible in the NBD code, at
which point the NBD code can be simplified.  Vladimir (added in cc) is
the best point of contact there.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Vladimir Sementsov-Ogievskiy 3 years, 2 months ago
23.02.2021 16:41, Eric Blake wrote:
> On 2/23/21 3:40 AM, Stefan Hajnoczi wrote:
>> On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote:
>>> This patch series propose to extend the werror=/rerror= mechanism to add
>>> a 'retry' feature. It can automatically retry failed I/O requests on error
>>> without sending error back to guest, and guest can get back running smoothly
>>> when I/O is recovred.
>>
>> This patch series implements a retry followed by werror/rerror=report
>> after a timeout. This mechanism could be made more generic (and the code
>> could be simplified) by removing the new werror/rerror=retry action and
>> instead implementing the retry/timeout followed by *any* werror=/rerror=
>> policy chosen by the user.
>>
>> In other words, if the retry interval is non-zero, retry the request and
>> check for timeouts. When the timeout is reached, obey the
>> werror=/rerror= action.
>>
>> This is more flexible than hard-coding werror=retry to mean retry
>> timeout followed by werror=report.
>>
>> For example:
>>
>>    werror=stop,write-retry-interval=1000,write-retry-timeout=15000,
>>    rerror=report,read-retry-interval=1000,read-retry-timeout=15000
>>
>> Failed write requests will be retried once a second for 15 seconds.
>> If the timeout is reached the guest is stopped.
>>
>> Failed read requests will be retried once a second for 15 seconds. If
>> the timeout is reached the error is reported to the guest.
> 
> You may also want to look at what the NBD block device already
> implements for retries, and see if making retry generic to the block
> layer in general can do everything already possible in the NBD code, at
> which point the NBD code can be simplified.  Vladimir (added in cc) is
> the best point of contact there.
> 

Hi!

There are some specific things for retrying in NBD client code, that may not make sense for generic code:

We try to handle not some generic error but tcp connection problems. So

1. We NBD retries we wanted to retry only requests failed due to connection loss.
    So, if NBD server successfully reply with -EIO to client request, we don't retry and just report -EIO.
    Such things can be distinguished only inside NBD driver.

2. Timeout is specific too: we start timer when we detected the connection lost for the first time.
    During the specified time all current requests are waiting for reconnect, they'll be retried if
    connection established.
    On timeout all current requests (waiting for reconnect) will report failure, as well as all further
    requests will report failure immediately (until successful reconnect).

Still, of course, NBD driver is over-complicated by reconnect feature (especially with reconnect thread:) and
it would be cool to split some parts to be separate...


-- 
Best regards,
Vladimir

Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
Posted by Stefan Hajnoczi 3 years, 2 months ago
On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote:
> This patch series propose to extend the werror=/rerror= mechanism to add
> a 'retry' feature. It can automatically retry failed I/O requests on error
> without sending error back to guest, and guest can get back running smoothly
> when I/O is recovred.

I posted a few questions but overall this looks promising. Thanks!

Stefan