If the backup target is a slow device like Ceph RBD, the backup process
seriously affects guest block write IO performance. This is caused by a
drawback of the COW mechanism: if the guest overwrites an area that has
not been backed up yet, the IO can only complete after the old data has
been written to the backup target.

The impact can be relieved by buffering the data read from the backup
source and writing it to the backup target later, so the guest block
write IO can complete in time. Areas that are never overwritten are
processed as before, without buffering, so in most cases we don't need
a very large buffer.

An fio test was run while the backup was in progress; the results show
an obvious performance improvement from buffering.

Test result (1GB buffer):
========================
fio setting:
[random-writers]
ioengine=libaio
iodepth=8
rw=randwrite
bs=32k
direct=1
size=1G
numjobs=1

result:
                      IOPS    AVG latency
  no backup:         19389        410 us
  backup:             1402       5702 us
  backup w/ buffer:   8684        918 us
==============================================

Cc: John Snow <jsnow@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Wen Congyang <wencongyang2@huawei.com>
Cc: Xie Changlong <xiechanglong.d@gmail.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Fam Zheng <fam@euphon.net>

Liang Li (2):
  backup: buffer COW request and delay the write operation
  qapi: add interface for setting backup cow buffer size

 block/backup.c            | 118 +++++++++++++++++++++++++++++++++++++++++-----
 block/replication.c       |   2 +-
 blockdev.c                |   5 ++
 include/block/block_int.h |   2 +
 qapi/block-core.json      |   5 ++
 5 files changed, 118 insertions(+), 14 deletions(-)

--
2.14.1
28.04.2019 13:01, Liang Li wrote:
> If the backup target is a slow device like ceph rbd, the backup
> process will affect guest BLK write IO performance seriously,
> it's cause by the drawback of COW mechanism, if guest overwrite the
> backup BLK area, the IO can only be processed after the data has
> been written to backup target.
> The impact can be relieved by buffering data read from backup
> source and writing to backup target later, so the guest BLK write
> IO can be processed in time.
> Data area with no overwrite will be process like before without
> buffering, in most case, we don't need a very large buffer.
>
> An fio test was done when the backup was going on, the test resut
> show a obvious performance improvement by buffering.

Hi Liang!

Good thing. I briefly mentioned something like this in my KVM Forum 2018
report as "RAM Cache", and I'd really prefer this functionality to be a
separate filter instead of a complication of the backup code.
Furthermore, write notifiers will go away from the backup code after my
backup-top series is merged.

v5: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg06211.html
and the separated preparatory refactoring v7:
https://lists.gnu.org/archive/html/qemu-devel/2019-04/msg04813.html

RAM Cache should be a filter driver, with in-memory buffer(s) for data
written to it and with the ability to flush that data to the underlying
backing file.

Also, here is another approach to the problem, which helps if guest
write activity is really high and sustained, so the buffer fills up and
performance decreases anyway:

1. Create a local temporary image, and CBWs will go to it. (It was
previously discussed on the list that we should call these backup
operations issued by guest writes CBW = copy-before-write, as
copy-on-write is generally a different thing, and using that term in
backup is confusing.)

2. We also set the original disk as a backing file for the temporary
image, and start another backup from the temporary image to the real
target.

This scheme is almost possible now: you start a backup(sync=none) from
source to temp to do [1]. Some patches are still needed to allow such a
scheme. I didn't send them, as I want my other backup patches to go in
first anyway, but I can. On the other hand, if the approach with an
in-memory buffer works for you, it may be better.

Also, I'm not sure for now whether we should really do this through two
backup jobs, or whether we just need one separate backup-top filter and
one backup job without a filter, or an additional parameter for the
backup job to set a cache-block-node.

>
> Test result(1GB buffer):
> ========================
> fio setting:
> [random-writers]
> ioengine=libaio
> iodepth=8
> rw=randwrite
> bs=32k
> direct=1
> size=1G
> numjobs=1
>
> result:
>                       IOPS    AVG latency
>   no backup:         19389        410 us
>   backup:             1402       5702 us
>   backup w/ buffer:   8684        918 us
> ==============================================
>
> Cc: John Snow <jsnow@redhat.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Max Reitz <mreitz@redhat.com>
> Cc: Wen Congyang <wencongyang2@huawei.com>
> Cc: Xie Changlong <xiechanglong.d@gmail.com>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Fam Zheng <fam@euphon.net>
>
> Liang Li (2):
>   backup: buffer COW request and delay the write operation
>   qapi: add interface for setting backup cow buffer size
>
>  block/backup.c            | 118 +++++++++++++++++++++++++++++++++++++++++-----
>  block/replication.c       |   2 +-
>  blockdev.c                |   5 ++
>  include/block/block_int.h |   2 +
>  qapi/block-core.json      |   5 ++
>  5 files changed, 118 insertions(+), 14 deletions(-)
>

--
Best regards,
Vladimir
On Tue, Apr 30, 2019 at 10:35:32AM +0000, Vladimir Sementsov-Ogievskiy wrote:
> 28.04.2019 13:01, Liang Li wrote:
> > If the backup target is a slow device like ceph rbd, the backup
> > process will affect guest BLK write IO performance seriously,
> > it's cause by the drawback of COW mechanism, if guest overwrite the
> > backup BLK area, the IO can only be processed after the data has
> > been written to backup target.
> > The impact can be relieved by buffering data read from backup
> > source and writing to backup target later, so the guest BLK write
> > IO can be processed in time.
> > Data area with no overwrite will be process like before without
> > buffering, in most case, we don't need a very large buffer.
> >
> > An fio test was done when the backup was going on, the test resut
> > show a obvious performance improvement by buffering.
>
> Hi Liang!
>
> Good thing. Something like this I've briefly mentioned in my KVM Forum
> 2018 report as "RAM Cache", and I'd really prefer this functionality to
> be a separate filter, instead of complication of backup code. Further
> more, write notifiers will go away from backup code, after my
> backup-top series merged.
>
> v5: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg06211.html
> and separated preparing refactoring v7:
> https://lists.gnu.org/archive/html/qemu-devel/2019-04/msg04813.html
>
> RAM Cache should be a filter driver, with an in-memory buffer(s) for
> data written to it and with ability to flush data to underlying backing
> file.
>
> Also, here is another approach for the problem, which helps if guest
> writing activity is really high and long and buffer will be filled and
> performance will decrease anyway:
>
> 1. Create local temporary image, and COWs will go to it. (previously
> considered on list, that we should call these backup operations issued
> by guest writes CBW = copy-before-write, as copy-on-write is generally
> another thing, and using this term in backup is confusing).
>
> 2. We also set original disk as a backing for temporary image, and
> start another backup from temporary to real target.
>
> This scheme is almost possible now, you need to start backup(sync=none)
> from source to temp, to do [1]. Some patches are still needed to allow
> such scheme. I didn't send them, as I want my other backup patches go
> first anyway. But I can. On the other hand if approach with in-memory
> buffer works for you it may be better.
>
> Also, I'm not sure for now, should we really do this thing through two
> backup jobs, or we just need one separate backup-top filter and one
> backup job without filter, or we need an additional parameter for
> backup job to set cache-block-node.
>

Hi Vladimir,

Thanks for your valuable information. I didn't notice that you were
already working on this, so my patch will conflict with your work. We
thought about approach [2] and gave it up because it would affect local
storage performance.

I have read your slides from KVM Forum 2018 and the related patches;
your solution can help solve the issues in backup. I am not sure whether
the "RAM cache" is a qcow2 file in RAM? If so, will your implementation
free the RAM space occupied by block data once it has been written to
the far target in time? Otherwise we may need a large cache to make
things work.

Two backup jobs seem complex and not user friendly; is it possible to
make my patch work together with CBW?

Liang
06.05.2019 7:24, Liang Li wrote:
> On Tue, Apr 30, 2019 at 10:35:32AM +0000, Vladimir Sementsov-Ogievskiy wrote:
>> 28.04.2019 13:01, Liang Li wrote:
>>> If the backup target is a slow device like ceph rbd, the backup
>>> process will affect guest BLK write IO performance seriously,
>>> it's cause by the drawback of COW mechanism, if guest overwrite the
>>> backup BLK area, the IO can only be processed after the data has
>>> been written to backup target.
>>> The impact can be relieved by buffering data read from backup
>>> source and writing to backup target later, so the guest BLK write
>>> IO can be processed in time.
>>> Data area with no overwrite will be process like before without
>>> buffering, in most case, we don't need a very large buffer.
>>>
>>> An fio test was done when the backup was going on, the test resut
>>> show a obvious performance improvement by buffering.
>>
>> Hi Liang!
>>
>> Good thing. Something like this I've briefly mentioned in my KVM Forum
>> 2018 report as "RAM Cache", and I'd really prefer this functionality
>> to be a separate filter, instead of complication of backup code.
>> Further more, write notifiers will go away from backup code, after my
>> backup-top series merged.
>>
>> v5: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg06211.html
>> and separated preparing refactoring v7:
>> https://lists.gnu.org/archive/html/qemu-devel/2019-04/msg04813.html
>>
>> RAM Cache should be a filter driver, with an in-memory buffer(s) for
>> data written to it and with ability to flush data to underlying
>> backing file.
>>
>> Also, here is another approach for the problem, which helps if guest
>> writing activity is really high and long and buffer will be filled and
>> performance will decrease anyway:
>>
>> 1. Create local temporary image, and COWs will go to it. (previously
>> considered on list, that we should call these backup operations issued
>> by guest writes CBW = copy-before-write, as copy-on-write is generally
>> another thing, and using this term in backup is confusing).
>>
>> 2. We also set original disk as a backing for temporary image, and
>> start another backup from temporary to real target.
>>
>> This scheme is almost possible now, you need to start backup(sync=none)
>> from source to temp, to do [1]. Some patches are still needed to allow
>> such scheme. I didn't send them, as I want my other backup patches go
>> first anyway. But I can. On the other hand if approach with in-memory
>> buffer works for you it may be better.
>>
>> Also, I'm not sure for now, should we really do this thing through two
>> backup jobs, or we just need one separate backup-top filter and one
>> backup job without filter, or we need an additional parameter for
>> backup job to set cache-block-node.
>>
>
> Hi Vladimir,
>
> Thanks for your valuable information. I didn't notice that you are
> already working on this, so my patch will conflict with your work. We
> have thought about the way [2] and give it up because it would affect
> local storage performance.
> I have read your slice in KVM Forum 2018 and the related patches, your
> solution can help to solve the issues in backup. I am not sure if the
> "RAM cache" is a qcow2 file in RAM? if so, your implementation will
> free the RAM space occupied by BLK data once it's written to the far
> target in time? or we may need a large cache to make things work.
> Two backup jobs seems complex and not user friendly, is it possible to
> make my patch cowork with CBW?

No, I don't think that the RAM cache should be qcow2 in RAM. What you
are doing is actually caching CBW data in RAM. To do this when CBW is
done in the backup-top filter driver, there are two ways:

1. Do the caching inside the backup-top filter - it would be like your
approach of caching inside CBW operations. I think this is a bad way.

2. Make a separate filter driver for caching - RAM Cache. Probably it
should store in-flight requests as is, just a list of buffers and
offsets.

--
Best regards,
Vladimir