block/vdi.c | 3 ++- block/vmdk.c | 3 +++ block/vpc.c | 2 +- 3 files changed, 6 insertions(+), 2 deletions(-)
Hi, 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it would only go down to the protocol layer if the format layer returned BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient information whether a given range in the image is zero or not. Generally, this is because the image is preallocated and thus all ranges appear as zeroes. However, it only implemented this preallocation detection for qcow2. There are more formats that support preallocation, though: vdi, vhdx, vmdk, vpc. (Funny how they all start with “v”.) For vdi, vmdk, and vpc, the fix is rather simple, because they really have different subformats depending on whether an image is preallocated or not. This makes the check very simple. vhdx is more like qcow2, where after the image has been created, it isn’t clear whether it’s been preallocated or everything is allocated because everything was already written to. 69f47505ee added a heuristic to qcow2 to get around this, but I think that’s too much for vhdx. I just left it unfixed, because I don’t care that much, honestly (and I don’t think anyone else does). Max Reitz (3): vdi: Make block_status recurse for fixed images vmdk: Make block_status recurse for flat extents vpc: Do not return RAW from block_status block/vdi.c | 3 ++- block/vmdk.c | 3 +++ block/vpc.c | 2 +- 3 files changed, 6 insertions(+), 2 deletions(-) -- 2.21.0
On 7/25/19 11:55 AM, Max Reitz wrote: > Hi, > > 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it > would only go down to the protocol layer if the format layer returned > BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient > information whether a given range in the image is zero or not. > Generally, this is because the image is preallocated and thus all ranges > appear as zeroes. > > However, it only implemented this preallocation detection for qcow2. > There are more formats that support preallocation, though: vdi, vhdx, > vmdk, vpc. (Funny how they all start with “v”.) > > For vdi, vmdk, and vpc, the fix is rather simple, because they really > have different subformats depending on whether an image is preallocated > or not. This makes the check very simple. > > vhdx is more like qcow2, where after the image has been created, it > isn’t clear whether it’s been preallocated or everything is allocated > because everything was already written to. 69f47505ee added a heuristic > to qcow2 to get around this, but I think that’s too much for vhdx. I > just left it unfixed, because I don’t care that much, honestly (and I > don’t think anyone else does). > What's the practical outcome of that, and is the limitation documented somewhere? (I'm fine with not fixing it, I just want it documented somehow.) > > Max Reitz (3): > vdi: Make block_status recurse for fixed images > vmdk: Make block_status recurse for flat extents > vpc: Do not return RAW from block_status > > block/vdi.c | 3 ++- > block/vmdk.c | 3 +++ > block/vpc.c | 2 +- > 3 files changed, 6 insertions(+), 2 deletions(-) >
On 12.08.19 20:39, John Snow wrote: > > > On 7/25/19 11:55 AM, Max Reitz wrote: >> Hi, >> >> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it >> would only go down to the protocol layer if the format layer returned >> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient >> information whether a given range in the image is zero or not. >> Generally, this is because the image is preallocated and thus all ranges >> appear as zeroes. >> >> However, it only implemented this preallocation detection for qcow2. >> There are more formats that support preallocation, though: vdi, vhdx, >> vmdk, vpc. (Funny how they all start with “v”.) >> >> For vdi, vmdk, and vpc, the fix is rather simple, because they really >> have different subformats depending on whether an image is preallocated >> or not. This makes the check very simple. >> >> vhdx is more like qcow2, where after the image has been created, it >> isn’t clear whether it’s been preallocated or everything is allocated >> because everything was already written to. 69f47505ee added a heuristic >> to qcow2 to get around this, but I think that’s too much for vhdx. I >> just left it unfixed, because I don’t care that much, honestly (and I >> don’t think anyone else does). >> > > What's the practical outcome of that, and is the limitation documented > somewhere? The outcome is that it if you preallocate a vhdx image (subformat=fixed), you’ll see that all sectors contain data, even if they may be zero sectors on the filesystem level. I don’t think it’s user-visible whatsoever. > (I'm fine with not fixing it, I just want it documented somehow.) I am really not inclined to start any documentation on the particularities with which qemu handles vhdx images. (Especially so considering we don’t even have any documentation on the qcow2 case. The stress in my paragraph was “heuristic”. If you preallocate a qcow2 image, but then discard enough sectors that the heuristic thinks you didn’t, you’ll have the same effect. Or if you grow a preallocated image without preallocating the new area.) Max >> >> Max Reitz (3): >> vdi: Make block_status recurse for fixed images >> vmdk: Make block_status recurse for flat extents >> vpc: Do not return RAW from block_status >> >> block/vdi.c | 3 ++- >> block/vmdk.c | 3 +++ >> block/vpc.c | 2 +- >> 3 files changed, 6 insertions(+), 2 deletions(-) >>
On 8/12/19 3:11 PM, Max Reitz wrote: > On 12.08.19 20:39, John Snow wrote: >> >> >> On 7/25/19 11:55 AM, Max Reitz wrote: >>> Hi, >>> >>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it >>> would only go down to the protocol layer if the format layer returned >>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient >>> information whether a given range in the image is zero or not. >>> Generally, this is because the image is preallocated and thus all ranges >>> appear as zeroes. >>> >>> However, it only implemented this preallocation detection for qcow2. >>> There are more formats that support preallocation, though: vdi, vhdx, >>> vmdk, vpc. (Funny how they all start with “v”.) >>> >>> For vdi, vmdk, and vpc, the fix is rather simple, because they really >>> have different subformats depending on whether an image is preallocated >>> or not. This makes the check very simple. >>> >>> vhdx is more like qcow2, where after the image has been created, it >>> isn’t clear whether it’s been preallocated or everything is allocated >>> because everything was already written to. 69f47505ee added a heuristic >>> to qcow2 to get around this, but I think that’s too much for vhdx. I >>> just left it unfixed, because I don’t care that much, honestly (and I >>> don’t think anyone else does). >>> >> >> What's the practical outcome of that, and is the limitation documented >> somewhere? > > The outcome is that it if you preallocate a vhdx image > (subformat=fixed), you’ll see that all sectors contain data, even if > they may be zero sectors on the filesystem level. > > I don’t think it’s user-visible whatsoever. > But it might mean that doing things with sync=top might over-allocate data depending on the destination, wouldn't it? That's not crucial, but it's possibly visible, no? >> (I'm fine with not fixing it, I just want it documented somehow.) > > I am really not inclined to start any documentation on the > particularities with which qemu handles vhdx images. > > (Especially so considering we don’t even have any documentation on the > qcow2 case. The stress in my paragraph was “heuristic”. If you > preallocate a qcow2 image, but then discard enough sectors that the > heuristic thinks you didn’t, you’ll have the same effect. Or if you > grow a preallocated image without preallocating the new area.) > > Max > "But our qcow2 docs are also bad" is the kind of argument I can't *really* disagree with, but... (I wish we did have a documentation manual per-format that mentioned some gotchas and general info about each format, but I can't really ask you to do that now: I just worry when I see patches like this that the knowledge or memory that there ever was a quirk will vanish immediately.) --js
On 12.08.19 23:45, John Snow wrote: > > > On 8/12/19 3:11 PM, Max Reitz wrote: >> On 12.08.19 20:39, John Snow wrote: >>> >>> >>> On 7/25/19 11:55 AM, Max Reitz wrote: >>>> Hi, >>>> >>>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it >>>> would only go down to the protocol layer if the format layer returned >>>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient >>>> information whether a given range in the image is zero or not. >>>> Generally, this is because the image is preallocated and thus all ranges >>>> appear as zeroes. >>>> >>>> However, it only implemented this preallocation detection for qcow2. >>>> There are more formats that support preallocation, though: vdi, vhdx, >>>> vmdk, vpc. (Funny how they all start with “v”.) >>>> >>>> For vdi, vmdk, and vpc, the fix is rather simple, because they really >>>> have different subformats depending on whether an image is preallocated >>>> or not. This makes the check very simple. >>>> >>>> vhdx is more like qcow2, where after the image has been created, it >>>> isn’t clear whether it’s been preallocated or everything is allocated >>>> because everything was already written to. 69f47505ee added a heuristic >>>> to qcow2 to get around this, but I think that’s too much for vhdx. I >>>> just left it unfixed, because I don’t care that much, honestly (and I >>>> don’t think anyone else does). >>>> >>> >>> What's the practical outcome of that, and is the limitation documented >>> somewhere? >> >> The outcome is that it if you preallocate a vhdx image >> (subformat=fixed), you’ll see that all sectors contain data, even if >> they may be zero sectors on the filesystem level. >> >> I don’t think it’s user-visible whatsoever. >> > > But it might mean that doing things with sync=top might over-allocate > data depending on the destination, wouldn't it? > > That's not crucial, but it's possibly visible, no? I don’t think it has anything to do with sync=top because whether a block is zero on the protocol level has nothing to do with whether it is allocated on the format level. It may make a difference for convert which uses block_status to inquire the zero status. However, it also does zero-detection, so... >>> (I'm fine with not fixing it, I just want it documented somehow.) >> >> I am really not inclined to start any documentation on the >> particularities with which qemu handles vhdx images. >> >> (Especially so considering we don’t even have any documentation on the >> qcow2 case. The stress in my paragraph was “heuristic”. If you >> preallocate a qcow2 image, but then discard enough sectors that the >> heuristic thinks you didn’t, you’ll have the same effect. Or if you >> grow a preallocated image without preallocating the new area.) >> >> Max >> > > "But our qcow2 docs are also bad" is the kind of argument I can't > *really* disagree with, but... My main argument is that nobody would read the vhdx docs anyway. Max > (I wish we did have a documentation manual per-format that mentioned > some gotchas and general info about each format, but I can't really ask > you to do that now: I just worry when I see patches like this that the > knowledge or memory that there ever was a quirk will vanish immediately.) > > --js >
On 8/13/19 10:48 AM, Max Reitz wrote: > On 12.08.19 23:45, John Snow wrote: >> >> >> On 8/12/19 3:11 PM, Max Reitz wrote: >>> On 12.08.19 20:39, John Snow wrote: >>>> >>>> >>>> On 7/25/19 11:55 AM, Max Reitz wrote: >>>>> Hi, >>>>> >>>>> 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it >>>>> would only go down to the protocol layer if the format layer returned >>>>> BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient >>>>> information whether a given range in the image is zero or not. >>>>> Generally, this is because the image is preallocated and thus all ranges >>>>> appear as zeroes. >>>>> >>>>> However, it only implemented this preallocation detection for qcow2. >>>>> There are more formats that support preallocation, though: vdi, vhdx, >>>>> vmdk, vpc. (Funny how they all start with “v”.) >>>>> >>>>> For vdi, vmdk, and vpc, the fix is rather simple, because they really >>>>> have different subformats depending on whether an image is preallocated >>>>> or not. This makes the check very simple. >>>>> >>>>> vhdx is more like qcow2, where after the image has been created, it >>>>> isn’t clear whether it’s been preallocated or everything is allocated >>>>> because everything was already written to. 69f47505ee added a heuristic >>>>> to qcow2 to get around this, but I think that’s too much for vhdx. I >>>>> just left it unfixed, because I don’t care that much, honestly (and I >>>>> don’t think anyone else does). >>>>> >>>> >>>> What's the practical outcome of that, and is the limitation documented >>>> somewhere? >>> >>> The outcome is that it if you preallocate a vhdx image >>> (subformat=fixed), you’ll see that all sectors contain data, even if >>> they may be zero sectors on the filesystem level. >>> >>> I don’t think it’s user-visible whatsoever. >>> >> >> But it might mean that doing things with sync=top might over-allocate >> data depending on the destination, wouldn't it? >> >> That's not crucial, but it's possibly visible, no? > > I don’t think it has anything to do with sync=top because whether a > block is zero on the protocol level has nothing to do with whether it is > allocated on the format level. > > It may make a difference for convert which uses block_status to inquire > the zero status. However, it also does zero-detection, so... > Oh, okay then. Probably... fine, but I have a nagging doubt relating to some of the fallbacks in e.g. qcow2 that tend to inflate zeroes in some cases (or used to. Maybe it's been fixed since.) ...but I can't point to anything, so it's fine, and I'm just drawing things out for no reason. Reviewed-by: John Snow <jsnow@redhat.com> >>>> (I'm fine with not fixing it, I just want it documented somehow.) >>> >>> I am really not inclined to start any documentation on the >>> particularities with which qemu handles vhdx images. >>> >>> (Especially so considering we don’t even have any documentation on the >>> qcow2 case. The stress in my paragraph was “heuristic”. If you >>> preallocate a qcow2 image, but then discard enough sectors that the >>> heuristic thinks you didn’t, you’ll have the same effect. Or if you >>> grow a preallocated image without preallocating the new area.) >>> >>> Max >>> >> >> "But our qcow2 docs are also bad" is the kind of argument I can't >> *really* disagree with, but... > > My main argument is that nobody would read the vhdx docs anyway. > > Max > That's the sort of thing I'd like to change, but I guess I haven't really made good on that desire in any way, so what good is that? --js
On 25.07.19 17:55, Max Reitz wrote: > Hi, > > 69f47505ee66afaa513305de0c1895a224e52c45 changed block_status so that it > would only go down to the protocol layer if the format layer returned > BDRV_BLOCK_RECURSE, thus indicating that it has no sufficient > information whether a given range in the image is zero or not. > Generally, this is because the image is preallocated and thus all ranges > appear as zeroes. > > However, it only implemented this preallocation detection for qcow2. > There are more formats that support preallocation, though: vdi, vhdx, > vmdk, vpc. (Funny how they all start with “v”.) > > For vdi, vmdk, and vpc, the fix is rather simple, because they really > have different subformats depending on whether an image is preallocated > or not. This makes the check very simple. > > vhdx is more like qcow2, where after the image has been created, it > isn’t clear whether it’s been preallocated or everything is allocated > because everything was already written to. 69f47505ee added a heuristic > to qcow2 to get around this, but I think that’s too much for vhdx. I > just left it unfixed, because I don’t care that much, honestly (and I > don’t think anyone else does). > > > Max Reitz (3): > vdi: Make block_status recurse for fixed images > vmdk: Make block_status recurse for flat extents > vpc: Do not return RAW from block_status > > block/vdi.c | 3 ++- > block/vmdk.c | 3 +++ > block/vpc.c | 2 +- > 3 files changed, 6 insertions(+), 2 deletions(-) Thanks for the reviews, applied to my block-next branch: https://git.xanclic.moe/XanClic/qemu/commits/branch/block-next Max
© 2016 - 2024 Red Hat, Inc.