01.09.2020 18:07, Max Reitz wrote:
> On 27.08.20 23:08, Vladimir Sementsov-Ogievskiy wrote:
>> 21.08.2020 17:11, Vladimir Sementsov-Ogievskiy wrote:
>>> Hi all!
>>>
>>> Here is a filter, which does preallocation on write.
>>>
>>> In Virtuozzo we have to deal with some custom distributed storage
>>> solution, where allocation is relatively expensive operation. We have to
>>> workaround it in Qemu, so here is a new filter.
>>
>> I have a problem now with this thing.
>>
>> We need preallocation. But we don't want to explicitly specify it in all
>> the management tools.
>
> Why?
>
>> So it should be inserted by default.
>
> Why? You mean without any option? That seems... Interesting?
>
> (Also like a recipe for reports of performance regression in some cases.)
>
>> It's OK for
>> us to keep this default different from upstream... But there are
>> problems with the implicitly inserted filter (actually iotests fail and
>> I failed to fix them)
>
> I would suspect even if the iotests passed we would end up with a heap
> of problems that we would only notice at some later point. I thought
> you too weren’t too fond of the idea of implicit filters.
>
>> 1. I have to set bs->inherits_from for filter and it's child by hand
>> after bdrv_replace_node(), otherwise bdrv_check_perm doesn't work.
>>
>> 2. I have to set filter_bs->implicit and teach bdrv_refresh_filename()
>> to ignore implicit filters when it checks for drv->bdrv_file_open, to
>> avoid appearing of json in backing file names
>>
>> 3. And the real design problem, which seems impossible to fix: reopen is
>> broken, just because user is not prepared to the fact that file child is
>> a filter, not a file node and has another options, and don't support
>> options of file-posix.
>
> Well, what should I say. I feel like we have made efforts in the past
> years to make the block graph fully visible to users and yield the
> responsibility of managing it to the users, too, so I’m not surprised if
> a step backwards breaks that.
>
>> And seems all it (and mostly [3]) shows that implicitly inserting the
>> filter is near to be impossible..
>>
>> So, what are possible solutions?
>>
>> In virtuozzo7 we have preallocation feature done inside qcow2 driver.
>> This is very uncomfortable: we should to handle each possible over-EOF
>> write to underlying node (to keep data_end in sync to be able to shrink
>> preallocation on close()).. I don't like this way and don't want to port
>> it..
>>
>> Another option is implementing preallocation inside file-posix driver.
>> Then, instead of BDRV_REQ_NO_WAIT flag I'll need to extend serialising
>> requests API (bdrv_make_request_serialising() is already used in
>> file-posix.c) to dupport no-wait behavior + expanding the serialising
>> request bounds. This option seems feasible, so I'll try this way if no
>> other ideas.
>
> Possible, but you haven’t yet explained what the problem with the
> management layer inserting the preallocation filter is.
>
>> Filter is obviously the true way: we use generic block layer for native
>> request serialising, don't need to catch every write in qcow2 driver,
>> don't need to modify any other driver and get a universal thing. But how
>> to insert it implicitly (or at least automatically in some cases) and
>> avoid all the problems?
>
> I don’t understand why inserting it implicitly is important.
>
You are right. Thanks for strong point of view, this makes me to revise my own. Now I'm working on v6.
--
Best regards,
Vladimir