This API can be used to tell the other side of the stream to skip
some bytes in the stream. This can be used to create a sparse
file on the receiving side of a stream.
It takes just one argument @length, which says how big the hole
is. Since our streams are not rewindable like regular files, we
don't need @whence argument like seek(2) has.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
include/libvirt/libvirt-stream.h | 3 +++
src/driver-stream.h | 5 ++++
src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++
src/libvirt_public.syms | 1 +
4 files changed, 66 insertions(+)
diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h
index bee2516..4e0a599 100644
--- a/include/libvirt/libvirt-stream.h
+++ b/include/libvirt/libvirt-stream.h
@@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st,
size_t nbytes,
unsigned int flags);
+int virStreamSkip(virStreamPtr st,
+ unsigned long long length);
+
/**
* virStreamSourceFunc:
diff --git a/src/driver-stream.h b/src/driver-stream.h
index d4b0480..20ea13f 100644
--- a/src/driver-stream.h
+++ b/src/driver-stream.h
@@ -42,6 +42,10 @@ typedef int
unsigned int flags);
typedef int
+(*virDrvStreamSkip)(virStreamPtr st,
+ unsigned long long length);
+
+typedef int
(*virDrvStreamEventAddCallback)(virStreamPtr stream,
int events,
virStreamEventCallback cb,
@@ -68,6 +72,7 @@ struct _virStreamDriver {
virDrvStreamSend streamSend;
virDrvStreamRecv streamRecv;
virDrvStreamRecvFlags streamRecvFlags;
+ virDrvStreamSkip streamSkip;
virDrvStreamEventAddCallback streamEventAddCallback;
virDrvStreamEventUpdateCallback streamEventUpdateCallback;
virDrvStreamEventRemoveCallback streamEventRemoveCallback;
diff --git a/src/libvirt-stream.c b/src/libvirt-stream.c
index 80b2d47..55f3ef5 100644
--- a/src/libvirt-stream.c
+++ b/src/libvirt-stream.c
@@ -346,6 +346,63 @@ virStreamRecvFlags(virStreamPtr stream,
/**
+ * virStreamSkip:
+ * @stream: pointer to the stream object
+ * @length: number of bytes to skip
+ *
+ * Skip @length bytes in the stream. This is useful when there's
+ * no actual data in the stream, just a hole. If that's the case,
+ * this API can be used to skip the hole properly instead of
+ * transmitting zeroes to the other side.
+ *
+ * An example using this with a hypothetical file upload API
+ * looks like:
+ *
+ * virStream st;
+ *
+ * while (1) {
+ * char buf[4096];
+ * size_t len;
+ * if (..in hole...) {
+ * ..get hole size...
+ * virStreamSkip(st, len);
+ * } else {
+ * ...read len bytes...
+ * virStreamSend(st, buf, len);
+ * }
+ * }
+ *
+ * Returns 0 on success,
+ * -1 error
+ */
+int
+virStreamSkip(virStreamPtr stream,
+ unsigned long long length)
+{
+ VIR_DEBUG("stream=%p, length=%llu", stream, length);
+
+ virResetLastError();
+
+ virCheckStreamReturn(stream, -1);
+
+ if (stream->driver &&
+ stream->driver->streamSkip) {
+ int ret;
+ ret = (stream->driver->streamSkip)(stream, length);
+ if (ret < 0)
+ goto error;
+ return ret;
+ }
+
+ virReportUnsupportedError();
+
+ error:
+ virDispatchError(stream->conn);
+ return -1;
+}
+
+
+/**
* virStreamSendAll:
* @stream: pointer to the stream object
* @handler: source callback for reading data from application
diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms
index af863bb..acadda8 100644
--- a/src/libvirt_public.syms
+++ b/src/libvirt_public.syms
@@ -762,6 +762,7 @@ LIBVIRT_3.1.0 {
LIBVIRT_3.3.0 {
global:
virStreamRecvFlags;
+ virStreamSkip;
} LIBVIRT_3.1.0;
# .... define new API here using predicted next version number ....
--
2.10.2
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On 04/20/2017 06:01 AM, Michal Privoznik wrote:
> This API can be used to tell the other side of the stream to skip
s/can be/is (unless it can be used for something else ;-))
> some bytes in the stream. This can be used to create a sparse
> file on the receiving side of a stream.
>
> It takes just one argument @length, which says how big the hole
> is. Since our streams are not rewindable like regular files, we
> don't need @whence argument like seek(2) has.
lseek is an implementation detail... However, it could be stated that
the skipping would be from the current point in the file forward by some
number of bytes. It's expected to be used in conjunction with code that
is copying over the real (or non-zero) data and should be considered an
optimization over sending zere data segments.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
> include/libvirt/libvirt-stream.h | 3 +++
> src/driver-stream.h | 5 ++++
> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++
> src/libvirt_public.syms | 1 +
> 4 files changed, 66 insertions(+)
>
While it would be unused for now, should @flags be added. Who knows
what use it could have, but avoids a new Flags API, but does cause a few
other wording changes here.
Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward.
Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or
s/Skip/HoleSize/ - ewww). Names would then follow our more recent
function naming guidelines. I think I dislike the HoleSize much more
than the Skip.
> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h
> index bee2516..4e0a599 100644
> --- a/include/libvirt/libvirt-stream.h
> +++ b/include/libvirt/libvirt-stream.h
> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st,
> size_t nbytes,
> unsigned int flags);
>
> +int virStreamSkip(virStreamPtr st,
> + unsigned long long length);
Was there consideration for using 'off_t' instead of ULL? I know it's an
implementation detail of virFDStreamData and lseek() usage, but it does
hide things... IDC either way.
> >
> /**
> * virStreamSourceFunc:
> diff --git a/src/driver-stream.h b/src/driver-stream.h
> index d4b0480..20ea13f 100644
> --- a/src/driver-stream.h
> +++ b/src/driver-stream.h
> @@ -42,6 +42,10 @@ typedef int
> unsigned int flags);
>
> typedef int
> +(*virDrvStreamSkip)(virStreamPtr st,
> + unsigned long long length);
> +
> +typedef int
> (*virDrvStreamEventAddCallback)(virStreamPtr stream,
> int events,
> virStreamEventCallback cb,
> @@ -68,6 +72,7 @@ struct _virStreamDriver {
> virDrvStreamSend streamSend;
> virDrvStreamRecv streamRecv;
> virDrvStreamRecvFlags streamRecvFlags;
> + virDrvStreamSkip streamSkip;
> virDrvStreamEventAddCallback streamEventAddCallback;
> virDrvStreamEventUpdateCallback streamEventUpdateCallback;
> virDrvStreamEventRemoveCallback streamEventRemoveCallback;
> diff --git a/src/libvirt-stream.c b/src/libvirt-stream.c
> index 80b2d47..55f3ef5 100644
> --- a/src/libvirt-stream.c
> +++ b/src/libvirt-stream.c
> @@ -346,6 +346,63 @@ virStreamRecvFlags(virStreamPtr stream,
>
>
> /**
> + * virStreamSkip:
> + * @stream: pointer to the stream object
> + * @length: number of bytes to skip
> + *
> + * Skip @length bytes in the stream. This is useful when there's
> + * no actual data in the stream, just a hole. If that's the case,
> + * this API can be used to skip the hole properly instead of
> + * transmitting zeroes to the other side.
Consider something like:
Rather than transmitting empty file space, this API directs the @stream
target to create @length bytes of empty space. This API would be used
when uploading or downloading sparsely populated files to avoid the
needless copy of empty file space.
Although I suppose there could be more applications than file targets.
I'd say ACK, but I think the function name discussion needs to be
complete first...
John
> + *
> + * An example using this with a hypothetical file upload API
> + * looks like:
> + *
> + * virStream st;
> + *
> + * while (1) {
> + * char buf[4096];
> + * size_t len;
> + * if (..in hole...) {
> + * ..get hole size...
> + * virStreamSkip(st, len);
> + * } else {
> + * ...read len bytes...
> + * virStreamSend(st, buf, len);
> + * }
> + * }
> + *
> + * Returns 0 on success,
> + * -1 error
> + */
> +int
> +virStreamSkip(virStreamPtr stream,
> + unsigned long long length)
> +{
> + VIR_DEBUG("stream=%p, length=%llu", stream, length);
> +
> + virResetLastError();
> +
> + virCheckStreamReturn(stream, -1);
> +
> + if (stream->driver &&
> + stream->driver->streamSkip) {
> + int ret;
> + ret = (stream->driver->streamSkip)(stream, length);
> + if (ret < 0)
> + goto error;
> + return ret;
> + }
> +
> + virReportUnsupportedError();
> +
> + error:
> + virDispatchError(stream->conn);
> + return -1;
> +}
> +
> +
> +/**
> * virStreamSendAll:
> * @stream: pointer to the stream object
> * @handler: source callback for reading data from application
> diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms
> index af863bb..acadda8 100644
> --- a/src/libvirt_public.syms
> +++ b/src/libvirt_public.syms
> @@ -762,6 +762,7 @@ LIBVIRT_3.1.0 {
> LIBVIRT_3.3.0 {
> global:
> virStreamRecvFlags;
> + virStreamSkip;
> } LIBVIRT_3.1.0;
>
> # .... define new API here using predicted next version number ....
>
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On 05/04/2017 11:29 PM, John Ferlan wrote: > > > On 04/20/2017 06:01 AM, Michal Privoznik wrote: >> This API can be used to tell the other side of the stream to skip > > s/can be/is (unless it can be used for something else ;-)) > >> some bytes in the stream. This can be used to create a sparse >> file on the receiving side of a stream. >> >> It takes just one argument @length, which says how big the hole >> is. Since our streams are not rewindable like regular files, we >> don't need @whence argument like seek(2) has. > > lseek is an implementation detail... However, it could be stated that > the skipping would be from the current point in the file forward by some > number of bytes. It's expected to be used in conjunction with code that > is copying over the real (or non-zero) data and should be considered an > optimization over sending zere data segments. > >> >> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >> --- >> include/libvirt/libvirt-stream.h | 3 +++ >> src/driver-stream.h | 5 ++++ >> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ >> src/libvirt_public.syms | 1 + >> 4 files changed, 66 insertions(+) >> > > While it would be unused for now, should @flags be added. Who knows > what use it could have, but avoids a new Flags API, but does cause a few > other wording changes here. Ah sure. We should have @flags there. Good point. > > Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. > Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or > s/Skip/HoleSize/ - ewww). Names would then follow our more recent > function naming guidelines. I think I dislike the HoleSize much more > than the Skip. SetSkip and GetSkip sound wrong to me instead :D > >> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h >> index bee2516..4e0a599 100644 >> --- a/include/libvirt/libvirt-stream.h >> +++ b/include/libvirt/libvirt-stream.h >> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, >> size_t nbytes, >> unsigned int flags); >> >> +int virStreamSkip(virStreamPtr st, >> + unsigned long long length); > > Was there consideration for using 'off_t' instead of ULL? I know it's an > implementation detail of virFDStreamData and lseek() usage, but it does > hide things... IDC either way. The problem with off_t is that it is signed type, while ULL is unsigned. There's not much point in sending a negative offset, is there? Moreover, we use ULL for arguments like offset (not sure really why). Frankly, I don't really know why. Perhaps some types don't exist everywhere? Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 05/05/2017 07:25 AM, Michal Privoznik wrote:
> On 05/04/2017 11:29 PM, John Ferlan wrote:
>>
>>
>> On 04/20/2017 06:01 AM, Michal Privoznik wrote:
>>> This API can be used to tell the other side of the stream to skip
>>
>> s/can be/is (unless it can be used for something else ;-))
>>
>>> some bytes in the stream. This can be used to create a sparse
>>> file on the receiving side of a stream.
>>>
>>> It takes just one argument @length, which says how big the hole
>>> is. Since our streams are not rewindable like regular files, we
>>> don't need @whence argument like seek(2) has.
>>
>> lseek is an implementation detail... However, it could be stated that
>> the skipping would be from the current point in the file forward by some
>> number of bytes. It's expected to be used in conjunction with code that
>> is copying over the real (or non-zero) data and should be considered an
>> optimization over sending zere data segments.
>>
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>>> ---
>>> include/libvirt/libvirt-stream.h | 3 +++
>>> src/driver-stream.h | 5 ++++
>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++
>>> src/libvirt_public.syms | 1 +
>>> 4 files changed, 66 insertions(+)
>>>
>>
>> While it would be unused for now, should @flags be added. Who knows
>> what use it could have, but avoids a new Flags API, but does cause a few
>> other wording changes here.
>
> Ah sure. We should have @flags there. Good point.
>
>>
>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward.
>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or
>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent
>> function naming guidelines. I think I dislike the HoleSize much more
>> than the Skip.
>
> SetSkip and GetSkip sound wrong to me instead :D
>
I understand completely (in more ways than one)! However, I still have
the hacking/coding style guide to fallback upon that requires certain
syntax ;-)... While Skip "qualifies" as a verb and perhaps squeaks by on
that technicality - the "HoleSize" has no verb telling me what HoleSize
is doing. The dichotomy that "Skip" doesn't have a Size object, but Hole
does isn't lost either. I actually think Size is a "given", but as
someone who has trouble creating names that pass muster when my code is
reviewed - who am I to give too much advice here!
Also now that I'm further down the road - I keep having to attempt to
remind myself whether this is a we're setting/performing the set skip
operation or this is a we're getting the set hole size operation.
Function, RPC, structures, etc. names would help unmuddle things.
>>
>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h
>>> index bee2516..4e0a599 100644
>>> --- a/include/libvirt/libvirt-stream.h
>>> +++ b/include/libvirt/libvirt-stream.h
>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st,
>>> size_t nbytes,
>>> unsigned int flags);
>>>
>>> +int virStreamSkip(virStreamPtr st,
>>> + unsigned long long length);
>>
>> Was there consideration for using 'off_t' instead of ULL? I know it's an
>> implementation detail of virFDStreamData and lseek() usage, but it does
>> hide things... IDC either way.
>
> The problem with off_t is that it is signed type, while ULL is unsigned.
> There's not much point in sending a negative offset, is there?
> Moreover, we use ULL for arguments like offset (not sure really why).
> Frankly, I don't really know why. Perhaps some types don't exist everywhere?
>
> Michal
>
The thing is the implementation uses a function that expects an off_t.
When I see ULL I'm usually considering memory sizes which can be large.
Not that a file couldn't be, but it usually isn't.
The concern is - could we run into a situation where a coding error
somewhere supplies a negative value that will now appear to be some
inordinately large positive value; whereas, we could avoid that by
stipulating 'virStreamSkip' (or whatever it gets called) expects a
positive and/or greater than 0 value (e.g. virCheckPositiveArgGoto).
Also as I see it, virFileInData calculates *length from two off_t's and
in none of those calculations is it possible to generate a negative
number. If we did, something is wrong. Still since virStreamSkip can be
called and not necessarily use that calculation.
I'll also point out the coding example provided for virStreamSkip:
+ * while (1) {
+ * char buf[4096];
+ * size_t len; <=====
+ * if (..in hole...) {
+ * ..get hole size...
+ * virStreamSkip(st, len);
^^^^ <=== Not a ULL...
Although the corollary did use the ULL for virStreamHoleSize in its example.
Similar argument for virStreamHoleSize could be made - we shouldn't
receive a negative value, but we have no way of differentiating a coding
mistake and an inordinately large value.
At least by enforcing usage of 'off_t' we're saying - go forward...
John
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: > On 05/04/2017 11:29 PM, John Ferlan wrote: > > > > > > On 04/20/2017 06:01 AM, Michal Privoznik wrote: > >> This API can be used to tell the other side of the stream to skip > > > > s/can be/is (unless it can be used for something else ;-)) > > > >> some bytes in the stream. This can be used to create a sparse > >> file on the receiving side of a stream. > >> > >> It takes just one argument @length, which says how big the hole > >> is. Since our streams are not rewindable like regular files, we > >> don't need @whence argument like seek(2) has. > > > > lseek is an implementation detail... However, it could be stated that > > the skipping would be from the current point in the file forward by some > > number of bytes. It's expected to be used in conjunction with code that > > is copying over the real (or non-zero) data and should be considered an > > optimization over sending zere data segments. > > > >> > >> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > >> --- > >> include/libvirt/libvirt-stream.h | 3 +++ > >> src/driver-stream.h | 5 ++++ > >> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ > >> src/libvirt_public.syms | 1 + > >> 4 files changed, 66 insertions(+) > >> > > > > While it would be unused for now, should @flags be added. Who knows > > what use it could have, but avoids a new Flags API, but does cause a few > > other wording changes here. > > Ah sure. We should have @flags there. Good point. > > > > > Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. > > Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or > > s/Skip/HoleSize/ - ewww). Names would then follow our more recent > > function naming guidelines. I think I dislike the HoleSize much more > > than the Skip. > > SetSkip and GetSkip sound wrong to me instead :D > > > > >> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h > >> index bee2516..4e0a599 100644 > >> --- a/include/libvirt/libvirt-stream.h > >> +++ b/include/libvirt/libvirt-stream.h > >> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, > >> size_t nbytes, > >> unsigned int flags); > >> > >> +int virStreamSkip(virStreamPtr st, > >> + unsigned long long length); > > > > Was there consideration for using 'off_t' instead of ULL? I know it's an > > implementation detail of virFDStreamData and lseek() usage, but it does > > hide things... IDC either way. > > The problem with off_t is that it is signed type, while ULL is unsigned. > There's not much point in sending a negative offset, is there? > Moreover, we use ULL for arguments like offset (not sure really why). > Frankly, I don't really know why. Perhaps some types don't exist everywhere? If anything, we would use size_t, for consistency with the Send/Recv methods. Ultimately though we have a fixed size data type on the wire (64-bit), so mapping to unsigned long long makes sense from that POV, as off_t can in theory be 32-bit in old platforms - similarly how we aim to avoid 'long int' (except for historical mistakes). Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 05/05/2017 04:48 PM, Daniel P. Berrange wrote: > On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: >> On 05/04/2017 11:29 PM, John Ferlan wrote: >>> >>> >>> On 04/20/2017 06:01 AM, Michal Privoznik wrote: >>>> This API can be used to tell the other side of the stream to skip >>> >>> s/can be/is (unless it can be used for something else ;-)) >>> >>>> some bytes in the stream. This can be used to create a sparse >>>> file on the receiving side of a stream. >>>> >>>> It takes just one argument @length, which says how big the hole >>>> is. Since our streams are not rewindable like regular files, we >>>> don't need @whence argument like seek(2) has. >>> >>> lseek is an implementation detail... However, it could be stated that >>> the skipping would be from the current point in the file forward by some >>> number of bytes. It's expected to be used in conjunction with code that >>> is copying over the real (or non-zero) data and should be considered an >>> optimization over sending zere data segments. >>> >>>> >>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >>>> --- >>>> include/libvirt/libvirt-stream.h | 3 +++ >>>> src/driver-stream.h | 5 ++++ >>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ >>>> src/libvirt_public.syms | 1 + >>>> 4 files changed, 66 insertions(+) >>>> >>> >>> While it would be unused for now, should @flags be added. Who knows >>> what use it could have, but avoids a new Flags API, but does cause a few >>> other wording changes here. >> >> Ah sure. We should have @flags there. Good point. >> >>> >>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. >>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or >>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent >>> function naming guidelines. I think I dislike the HoleSize much more >>> than the Skip. >> >> SetSkip and GetSkip sound wrong to me instead :D >> >>> >>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h >>>> index bee2516..4e0a599 100644 >>>> --- a/include/libvirt/libvirt-stream.h >>>> +++ b/include/libvirt/libvirt-stream.h >>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, >>>> size_t nbytes, >>>> unsigned int flags); >>>> >>>> +int virStreamSkip(virStreamPtr st, >>>> + unsigned long long length); >>> >>> Was there consideration for using 'off_t' instead of ULL? I know it's an >>> implementation detail of virFDStreamData and lseek() usage, but it does >>> hide things... IDC either way. >> >> The problem with off_t is that it is signed type, while ULL is unsigned. >> There's not much point in sending a negative offset, is there? >> Moreover, we use ULL for arguments like offset (not sure really why). >> Frankly, I don't really know why. Perhaps some types don't exist everywhere? > > If anything, we would use size_t, for consistency with the Send/Recv > methods. So I've given this some though and ran some experiments. On a 32bit arch I've found this: long long 8 signed size_t 4 unsigned ssize_t 4 signed off_t 4 signed So size_t is 4 bytes long and long long is 8 bytes. This got me thinking, size_t type makes sense for those APIs where we need to address individual bytes. But what would happen if I have the following file on a 32 bit arch: [2MB data] -> [5GB hole] -> [2M data] The hole does not fit into size_t, but it would fit into long long. On the other hand, we are very likely to hit lseek() error as off_t is 4 bytes also (WTF?!). On a 64 bit arch everything is as expected: long long 8 signed size_t 8 unsigned ssize_t 8 signed off_t 8 signed So after all, I think I can switch to size_t. I'm just really surprised to learn that off_t is 4 bytes on a 32 bit arch. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 05/12/2017 03:29 AM, Michal Privoznik wrote: > On 05/05/2017 04:48 PM, Daniel P. Berrange wrote: >> On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: >>> On 05/04/2017 11:29 PM, John Ferlan wrote: >>>> >>>> >>>> On 04/20/2017 06:01 AM, Michal Privoznik wrote: >>>>> This API can be used to tell the other side of the stream to skip >>>> >>>> s/can be/is (unless it can be used for something else ;-)) >>>> >>>>> some bytes in the stream. This can be used to create a sparse >>>>> file on the receiving side of a stream. >>>>> >>>>> It takes just one argument @length, which says how big the hole >>>>> is. Since our streams are not rewindable like regular files, we >>>>> don't need @whence argument like seek(2) has. >>>> >>>> lseek is an implementation detail... However, it could be stated that >>>> the skipping would be from the current point in the file forward by some >>>> number of bytes. It's expected to be used in conjunction with code that >>>> is copying over the real (or non-zero) data and should be considered an >>>> optimization over sending zere data segments. >>>> >>>>> >>>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >>>>> --- >>>>> include/libvirt/libvirt-stream.h | 3 +++ >>>>> src/driver-stream.h | 5 ++++ >>>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ >>>>> src/libvirt_public.syms | 1 + >>>>> 4 files changed, 66 insertions(+) >>>>> >>>> >>>> While it would be unused for now, should @flags be added. Who knows >>>> what use it could have, but avoids a new Flags API, but does cause a few >>>> other wording changes here. >>> >>> Ah sure. We should have @flags there. Good point. >>> >>>> >>>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. >>>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or >>>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent >>>> function naming guidelines. I think I dislike the HoleSize much more >>>> than the Skip. >>> >>> SetSkip and GetSkip sound wrong to me instead :D >>> >>>> >>>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h >>>>> index bee2516..4e0a599 100644 >>>>> --- a/include/libvirt/libvirt-stream.h >>>>> +++ b/include/libvirt/libvirt-stream.h >>>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, >>>>> size_t nbytes, >>>>> unsigned int flags); >>>>> >>>>> +int virStreamSkip(virStreamPtr st, >>>>> + unsigned long long length); >>>> >>>> Was there consideration for using 'off_t' instead of ULL? I know it's an >>>> implementation detail of virFDStreamData and lseek() usage, but it does >>>> hide things... IDC either way. >>> >>> The problem with off_t is that it is signed type, while ULL is unsigned. >>> There's not much point in sending a negative offset, is there? >>> Moreover, we use ULL for arguments like offset (not sure really why). >>> Frankly, I don't really know why. Perhaps some types don't exist everywhere? >> >> If anything, we would use size_t, for consistency with the Send/Recv >> methods. > > So I've given this some though and ran some experiments. On a 32bit arch > I've found this: > > long long 8 signed > size_t 4 unsigned > ssize_t 4 signed > off_t 4 signed > > So size_t is 4 bytes long and long long is 8 bytes. This got me > thinking, size_t type makes sense for those APIs where we need to > address individual bytes. But what would happen if I have the following > file on a 32 bit arch: > > [2MB data] -> [5GB hole] -> [2M data] Would a 5.4G file exist on a 32b arch since the OS couldn't address it nor could it get anything beyond 4GB-1 as a size and then only if using unsigned as the sizing... It's been far too long since I worked on 32 bit arches (or less). The 90's are over aren't they ;-) John > > The hole does not fit into size_t, but it would fit into long long. On > the other hand, we are very likely to hit lseek() error as off_t is 4 > bytes also (WTF?!). On a 64 bit arch everything is as expected: > > long long 8 signed > size_t 8 unsigned > ssize_t 8 signed > off_t 8 signed > > So after all, I think I can switch to size_t. I'm just really surprised > to learn that off_t is 4 bytes on a 32 bit arch. > > Michal > > -- > libvir-list mailing list > libvir-list@redhat.com > https://www.redhat.com/mailman/listinfo/libvir-list > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 05/12/2017 12:56 PM, John Ferlan wrote: > > > On 05/12/2017 03:29 AM, Michal Privoznik wrote: >> On 05/05/2017 04:48 PM, Daniel P. Berrange wrote: >>> On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: >>>> On 05/04/2017 11:29 PM, John Ferlan wrote: >>>>> >>>>> >>>>> On 04/20/2017 06:01 AM, Michal Privoznik wrote: >>>>>> This API can be used to tell the other side of the stream to skip >>>>> >>>>> s/can be/is (unless it can be used for something else ;-)) >>>>> >>>>>> some bytes in the stream. This can be used to create a sparse >>>>>> file on the receiving side of a stream. >>>>>> >>>>>> It takes just one argument @length, which says how big the hole >>>>>> is. Since our streams are not rewindable like regular files, we >>>>>> don't need @whence argument like seek(2) has. >>>>> >>>>> lseek is an implementation detail... However, it could be stated that >>>>> the skipping would be from the current point in the file forward by some >>>>> number of bytes. It's expected to be used in conjunction with code that >>>>> is copying over the real (or non-zero) data and should be considered an >>>>> optimization over sending zere data segments. >>>>> >>>>>> >>>>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >>>>>> --- >>>>>> include/libvirt/libvirt-stream.h | 3 +++ >>>>>> src/driver-stream.h | 5 ++++ >>>>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ >>>>>> src/libvirt_public.syms | 1 + >>>>>> 4 files changed, 66 insertions(+) >>>>>> >>>>> >>>>> While it would be unused for now, should @flags be added. Who knows >>>>> what use it could have, but avoids a new Flags API, but does cause a few >>>>> other wording changes here. >>>> >>>> Ah sure. We should have @flags there. Good point. >>>> >>>>> >>>>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. >>>>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or >>>>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent >>>>> function naming guidelines. I think I dislike the HoleSize much more >>>>> than the Skip. >>>> >>>> SetSkip and GetSkip sound wrong to me instead :D >>>> >>>>> >>>>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h >>>>>> index bee2516..4e0a599 100644 >>>>>> --- a/include/libvirt/libvirt-stream.h >>>>>> +++ b/include/libvirt/libvirt-stream.h >>>>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, >>>>>> size_t nbytes, >>>>>> unsigned int flags); >>>>>> >>>>>> +int virStreamSkip(virStreamPtr st, >>>>>> + unsigned long long length); >>>>> >>>>> Was there consideration for using 'off_t' instead of ULL? I know it's an >>>>> implementation detail of virFDStreamData and lseek() usage, but it does >>>>> hide things... IDC either way. >>>> >>>> The problem with off_t is that it is signed type, while ULL is unsigned. >>>> There's not much point in sending a negative offset, is there? >>>> Moreover, we use ULL for arguments like offset (not sure really why). >>>> Frankly, I don't really know why. Perhaps some types don't exist everywhere? >>> >>> If anything, we would use size_t, for consistency with the Send/Recv >>> methods. >> >> So I've given this some though and ran some experiments. On a 32bit arch >> I've found this: >> >> long long 8 signed >> size_t 4 unsigned >> ssize_t 4 signed >> off_t 4 signed >> >> So size_t is 4 bytes long and long long is 8 bytes. This got me >> thinking, size_t type makes sense for those APIs where we need to >> address individual bytes. But what would happen if I have the following >> file on a 32 bit arch: >> >> [2MB data] -> [5GB hole] -> [2M data] > > Would a 5.4G file exist on a 32b arch since the OS couldn't address it > nor could it get anything beyond 4GB-1 as a size and then only if using > unsigned as the sizing... It's been far too long since I worked on 32 > bit arches (or less). The 90's are over aren't they ;-) Sure it can. Kernel might use long long internally. It definitely has to use bigger type than size_t otherwise we couldn't have >4GB files. But we can. BTW there are other ways to detect holes in files. For instance 'cp' uses ioctl(fd, FS_IOC_FIEMAP, fiemap) to get map of extents on the disk. The fiemap struct then uses uint64_t type to represent addresses and lengths for extents. Therefore I'm gonna stick with my decision and have this unsigned long long. The other argument for using that is: we want ULL to be used at RPC level, right? However, this might clash when mixing 32 and 64 bit client/server as decoding into size_t might fail - ULL does not fit into size_t on 32 bits. Here's an example of a big sparse file: rpi ~ # truncate -s 5G bla.raw rpi ~ # ls -lhs bla.raw 0 -rw-r--r-- 1 root root 5.0G May 12 14:57 bla.raw rpi ~ # uname -a Linux rpi 4.1.15-v7+ #830 SMP Tue Dec 15 17:02:45 GMT 2015 armv7l ARMv7 Processor rev 5 (v7l) BCM2709 GNU/Linux rpi ~ # /home/zippy/tmp/sparse/sparse_map bla.raw hole: 0 len: 5368709120 end: 5368709120 sparse_map is a program I've written for getting data/hole map of a given file. You can find it here: https://pastebin.com/SSjAUi3q BTW: try changing those (long long) typecasts to (size_t) and you'll immediately see why we need long long instead of size_t. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, May 12, 2017 at 09:29:27AM +0200, Michal Privoznik wrote: > On 05/05/2017 04:48 PM, Daniel P. Berrange wrote: > > On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: > >> On 05/04/2017 11:29 PM, John Ferlan wrote: > >>> > >>> > >>> On 04/20/2017 06:01 AM, Michal Privoznik wrote: > >>>> This API can be used to tell the other side of the stream to skip > >>> > >>> s/can be/is (unless it can be used for something else ;-)) > >>> > >>>> some bytes in the stream. This can be used to create a sparse > >>>> file on the receiving side of a stream. > >>>> > >>>> It takes just one argument @length, which says how big the hole > >>>> is. Since our streams are not rewindable like regular files, we > >>>> don't need @whence argument like seek(2) has. > >>> > >>> lseek is an implementation detail... However, it could be stated that > >>> the skipping would be from the current point in the file forward by some > >>> number of bytes. It's expected to be used in conjunction with code that > >>> is copying over the real (or non-zero) data and should be considered an > >>> optimization over sending zere data segments. > >>> > >>>> > >>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > >>>> --- > >>>> include/libvirt/libvirt-stream.h | 3 +++ > >>>> src/driver-stream.h | 5 ++++ > >>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ > >>>> src/libvirt_public.syms | 1 + > >>>> 4 files changed, 66 insertions(+) > >>>> > >>> > >>> While it would be unused for now, should @flags be added. Who knows > >>> what use it could have, but avoids a new Flags API, but does cause a few > >>> other wording changes here. > >> > >> Ah sure. We should have @flags there. Good point. > >> > >>> > >>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. > >>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or > >>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent > >>> function naming guidelines. I think I dislike the HoleSize much more > >>> than the Skip. > >> > >> SetSkip and GetSkip sound wrong to me instead :D > >> > >>> > >>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h > >>>> index bee2516..4e0a599 100644 > >>>> --- a/include/libvirt/libvirt-stream.h > >>>> +++ b/include/libvirt/libvirt-stream.h > >>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, > >>>> size_t nbytes, > >>>> unsigned int flags); > >>>> > >>>> +int virStreamSkip(virStreamPtr st, > >>>> + unsigned long long length); > >>> > >>> Was there consideration for using 'off_t' instead of ULL? I know it's an > >>> implementation detail of virFDStreamData and lseek() usage, but it does > >>> hide things... IDC either way. > >> > >> The problem with off_t is that it is signed type, while ULL is unsigned. > >> There's not much point in sending a negative offset, is there? > >> Moreover, we use ULL for arguments like offset (not sure really why). > >> Frankly, I don't really know why. Perhaps some types don't exist everywhere? > > > > If anything, we would use size_t, for consistency with the Send/Recv > > methods. > > So I've given this some though and ran some experiments. On a 32bit arch > I've found this: > > long long 8 signed > size_t 4 unsigned > ssize_t 4 signed > off_t 4 signed > > So size_t is 4 bytes long and long long is 8 bytes. This got me > thinking, size_t type makes sense for those APIs where we need to > address individual bytes. But what would happen if I have the following > file on a 32 bit arch: > > [2MB data] -> [5GB hole] -> [2M data] > > The hole does not fit into size_t, but it would fit into long long. On > the other hand, we are very likely to hit lseek() error as off_t is 4 > bytes also (WTF?!). On a 64 bit arch everything is as expected: Were you testing libvirt, or testing a standalone demo program ? Libvirt should be defining the macro that enables large file support, which turns off_t into a 64-bit type. So in fact, we would actually be wrong to use size_t - off_t or long long is actually what we need here. I was wrong about consistency with Send/Recv, since the arg there is about the length of the data array which is size_t and this is a different boundary than max file size which is off_t. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 05/15/2017 10:25 AM, Daniel P. Berrange wrote: > On Fri, May 12, 2017 at 09:29:27AM +0200, Michal Privoznik wrote: >> On 05/05/2017 04:48 PM, Daniel P. Berrange wrote: >>> On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: >>>> On 05/04/2017 11:29 PM, John Ferlan wrote: >>>>> >>>>> >>>>> On 04/20/2017 06:01 AM, Michal Privoznik wrote: >>>>>> This API can be used to tell the other side of the stream to skip >>>>> >>>>> s/can be/is (unless it can be used for something else ;-)) >>>>> >>>>>> some bytes in the stream. This can be used to create a sparse >>>>>> file on the receiving side of a stream. >>>>>> >>>>>> It takes just one argument @length, which says how big the hole >>>>>> is. Since our streams are not rewindable like regular files, we >>>>>> don't need @whence argument like seek(2) has. >>>>> >>>>> lseek is an implementation detail... However, it could be stated that >>>>> the skipping would be from the current point in the file forward by some >>>>> number of bytes. It's expected to be used in conjunction with code that >>>>> is copying over the real (or non-zero) data and should be considered an >>>>> optimization over sending zere data segments. >>>>> >>>>>> >>>>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >>>>>> --- >>>>>> include/libvirt/libvirt-stream.h | 3 +++ >>>>>> src/driver-stream.h | 5 ++++ >>>>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ >>>>>> src/libvirt_public.syms | 1 + >>>>>> 4 files changed, 66 insertions(+) >>>>>> >>>>> >>>>> While it would be unused for now, should @flags be added. Who knows >>>>> what use it could have, but avoids a new Flags API, but does cause a few >>>>> other wording changes here. >>>> >>>> Ah sure. We should have @flags there. Good point. >>>> >>>>> >>>>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. >>>>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or >>>>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent >>>>> function naming guidelines. I think I dislike the HoleSize much more >>>>> than the Skip. >>>> >>>> SetSkip and GetSkip sound wrong to me instead :D >>>> >>>>> >>>>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h >>>>>> index bee2516..4e0a599 100644 >>>>>> --- a/include/libvirt/libvirt-stream.h >>>>>> +++ b/include/libvirt/libvirt-stream.h >>>>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, >>>>>> size_t nbytes, >>>>>> unsigned int flags); >>>>>> >>>>>> +int virStreamSkip(virStreamPtr st, >>>>>> + unsigned long long length); >>>>> >>>>> Was there consideration for using 'off_t' instead of ULL? I know it's an >>>>> implementation detail of virFDStreamData and lseek() usage, but it does >>>>> hide things... IDC either way. >>>> >>>> The problem with off_t is that it is signed type, while ULL is unsigned. >>>> There's not much point in sending a negative offset, is there? >>>> Moreover, we use ULL for arguments like offset (not sure really why). >>>> Frankly, I don't really know why. Perhaps some types don't exist everywhere? >>> >>> If anything, we would use size_t, for consistency with the Send/Recv >>> methods. >> >> So I've given this some though and ran some experiments. On a 32bit arch >> I've found this: >> >> long long 8 signed >> size_t 4 unsigned >> ssize_t 4 signed >> off_t 4 signed >> >> So size_t is 4 bytes long and long long is 8 bytes. This got me >> thinking, size_t type makes sense for those APIs where we need to >> address individual bytes. But what would happen if I have the following >> file on a 32 bit arch: >> >> [2MB data] -> [5GB hole] -> [2M data] >> >> The hole does not fit into size_t, but it would fit into long long. On >> the other hand, we are very likely to hit lseek() error as off_t is 4 >> bytes also (WTF?!). On a 64 bit arch everything is as expected: > > Were you testing libvirt, or testing a standalone demo program ? Standalone demo program, that basically does this: #define PRINT_SIZE(x) printf( #x " %d %s\n", sizeof(x), (x)-1 < 0 ? "signed" : "unsigned") PRINT_SIZE(long long); PRINT_SIZE(size_t); PRINT_SIZE(off_t); > > Libvirt should be defining the macro that enables large file support, > which turns off_t into a 64-bit type. So in fact, we would actually > be wrong to use size_t - off_t or long long is actually what we need > here. I think we really need just long long. Consider that even though libvirt enables large file support internally, we don't enable it at the header files level. It's responsibility of developers of mgmt application to do that. Having said that, if we had off_t in public API we can't possibly know its size as it depends on something out of our control. Therefore, I see long long as our only option. Now, question is whether we want signed or unsigned long long. I don't have an opinion about that. On one hand, off_t is signed, but that's because lseek() can seek backwards. We don't have that in our streams. Yet. On the other hand, our streams are different to regular files. I view them as a unidirectional pipe. With some extensions (e.g. sparse messages). lseek() doesn't work over pipes, does it. But then again, long long might be more future proof, if we will ever want to assign a meaning to negative seeks. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Mon, May 15, 2017 at 10:54:03AM +0200, Michal Privoznik wrote: > On 05/15/2017 10:25 AM, Daniel P. Berrange wrote: > > On Fri, May 12, 2017 at 09:29:27AM +0200, Michal Privoznik wrote: > >> On 05/05/2017 04:48 PM, Daniel P. Berrange wrote: > >>> On Fri, May 05, 2017 at 01:25:34PM +0200, Michal Privoznik wrote: > >>>> On 05/04/2017 11:29 PM, John Ferlan wrote: > >>>>> > >>>>> > >>>>> On 04/20/2017 06:01 AM, Michal Privoznik wrote: > >>>>>> This API can be used to tell the other side of the stream to skip > >>>>> > >>>>> s/can be/is (unless it can be used for something else ;-)) > >>>>> > >>>>>> some bytes in the stream. This can be used to create a sparse > >>>>>> file on the receiving side of a stream. > >>>>>> > >>>>>> It takes just one argument @length, which says how big the hole > >>>>>> is. Since our streams are not rewindable like regular files, we > >>>>>> don't need @whence argument like seek(2) has. > >>>>> > >>>>> lseek is an implementation detail... However, it could be stated that > >>>>> the skipping would be from the current point in the file forward by some > >>>>> number of bytes. It's expected to be used in conjunction with code that > >>>>> is copying over the real (or non-zero) data and should be considered an > >>>>> optimization over sending zere data segments. > >>>>> > >>>>>> > >>>>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > >>>>>> --- > >>>>>> include/libvirt/libvirt-stream.h | 3 +++ > >>>>>> src/driver-stream.h | 5 ++++ > >>>>>> src/libvirt-stream.c | 57 ++++++++++++++++++++++++++++++++++++++++ > >>>>>> src/libvirt_public.syms | 1 + > >>>>>> 4 files changed, 66 insertions(+) > >>>>>> > >>>>> > >>>>> While it would be unused for now, should @flags be added. Who knows > >>>>> what use it could have, but avoids a new Flags API, but does cause a few > >>>>> other wording changes here. > >>>> > >>>> Ah sure. We should have @flags there. Good point. > >>>> > >>>>> > >>>>> Perhaps it's just me - but "Skip" and "HoleSize" just seem awkward. > >>>>> Would "virStreamSetSkip" and "virStreamGetSkip" be more apropos? (or > >>>>> s/Skip/HoleSize/ - ewww). Names would then follow our more recent > >>>>> function naming guidelines. I think I dislike the HoleSize much more > >>>>> than the Skip. > >>>> > >>>> SetSkip and GetSkip sound wrong to me instead :D > >>>> > >>>>> > >>>>>> diff --git a/include/libvirt/libvirt-stream.h b/include/libvirt/libvirt-stream.h > >>>>>> index bee2516..4e0a599 100644 > >>>>>> --- a/include/libvirt/libvirt-stream.h > >>>>>> +++ b/include/libvirt/libvirt-stream.h > >>>>>> @@ -50,6 +50,9 @@ int virStreamRecvFlags(virStreamPtr st, > >>>>>> size_t nbytes, > >>>>>> unsigned int flags); > >>>>>> > >>>>>> +int virStreamSkip(virStreamPtr st, > >>>>>> + unsigned long long length); > >>>>> > >>>>> Was there consideration for using 'off_t' instead of ULL? I know it's an > >>>>> implementation detail of virFDStreamData and lseek() usage, but it does > >>>>> hide things... IDC either way. > >>>> > >>>> The problem with off_t is that it is signed type, while ULL is unsigned. > >>>> There's not much point in sending a negative offset, is there? > >>>> Moreover, we use ULL for arguments like offset (not sure really why). > >>>> Frankly, I don't really know why. Perhaps some types don't exist everywhere? > >>> > >>> If anything, we would use size_t, for consistency with the Send/Recv > >>> methods. > >> > >> So I've given this some though and ran some experiments. On a 32bit arch > >> I've found this: > >> > >> long long 8 signed > >> size_t 4 unsigned > >> ssize_t 4 signed > >> off_t 4 signed > >> > >> So size_t is 4 bytes long and long long is 8 bytes. This got me > >> thinking, size_t type makes sense for those APIs where we need to > >> address individual bytes. But what would happen if I have the following > >> file on a 32 bit arch: > >> > >> [2MB data] -> [5GB hole] -> [2M data] > >> > >> The hole does not fit into size_t, but it would fit into long long. On > >> the other hand, we are very likely to hit lseek() error as off_t is 4 > >> bytes also (WTF?!). On a 64 bit arch everything is as expected: > > > > Were you testing libvirt, or testing a standalone demo program ? > > Standalone demo program, that basically does this: > > #define PRINT_SIZE(x) printf( #x " %d %s\n", sizeof(x), (x)-1 < 0 ? > "signed" : "unsigned") > > PRINT_SIZE(long long); > PRINT_SIZE(size_t); > PRINT_SIZE(off_t); > > > > > > Libvirt should be defining the macro that enables large file support, > > which turns off_t into a 64-bit type. So in fact, we would actually > > be wrong to use size_t - off_t or long long is actually what we need > > here. > > I think we really need just long long. Consider that even though libvirt > enables large file support internally, we don't enable it at the header > files level. It's responsibility of developers of mgmt application to do > that. Having said that, if we had off_t in public API we can't possibly > know its size as it depends on something out of our control. Therefore, > I see long long as our only option. Yes, if we used off_t in the file, we'd be ABI-incompatible with programs that don't enable large file. So using long long protects us from that incompatibility - size_t is the only safe magic sized type we can use. > Now, question is whether we want signed or unsigned long long. I don't > have an opinion about that. On one hand, off_t is signed, but that's > because lseek() can seek backwards. We don't have that in our streams. > Yet. On the other hand, our streams are different to regular files. I > view them as a unidirectional pipe. With some extensions (e.g. sparse > messages). lseek() doesn't work over pipes, does it. But then again, > long long might be more future proof, if we will ever want to assign a > meaning to negative seeks. I guess we might as well use signed long long - we aren't gaining anything by using unsigned long long, since the value will be truncated to signed long long when we use it with lseek(). Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 05/15/2017 10:56 AM, Daniel P. Berrange wrote: > On Mon, May 15, 2017 at 10:54:03AM +0200, Michal Privoznik wrote: > >> Now, question is whether we want signed or unsigned long long. I don't >> have an opinion about that. On one hand, off_t is signed, but that's >> because lseek() can seek backwards. We don't have that in our streams. >> Yet. On the other hand, our streams are different to regular files. I >> view them as a unidirectional pipe. With some extensions (e.g. sparse >> messages). lseek() doesn't work over pipes, does it. But then again, >> long long might be more future proof, if we will ever want to assign a >> meaning to negative seeks. > > I guess we might as well use signed long long - we aren't gaining anything > by using unsigned long long, since the value will be truncated to signed > long long when we use it with lseek(). Good point. I'll go with signed long long then. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2026 Red Hat, Inc.