[PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features

Juergen Gross posted 5 patches 3 years, 6 months ago
There is a newer version of this series
[PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Juergen Gross 3 years, 6 months ago
Extend the definition of the Xenstore migration stream to cover new
features:

- per domain features
- extended watches (watch depth)
- per domain quota

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V3:
- new patch
---
 docs/designs/xenstore-migration.md | 85 ++++++++++++++++++++++++++++--
 1 file changed, 82 insertions(+), 3 deletions(-)

diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
index efa526f420..b2b1d3d5c7 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -43,7 +43,13 @@ the setting of the endianness bit.
 |-----------|---------------------------------------------------|
 | `ident`   | 0x78656e73746f7265 ('xenstore' in ASCII)          |
 |           |                                                   |
-| `version` | 0x00000001 (the version of the specification)     |
+| `version` | The version of the specification, defined values: |
+|           | 0x00000001: all fields without any explicitly     |
+|           |             mentioned version dependency are      |
+|           |             valid.                                |
+|           | 0x00000002: all fields valid for version 1 plus   |
+|           |             fields explicitly stated to be        |
+|           |             supported in version 2 are valid.     |
 |           |                                                   |
 | `flags`   | 0 (LSB): Endianness: 0 = little, 1 = big          |
 |           |                                                   |
@@ -117,7 +123,17 @@ xenstored state that needs to be restored.
 | rw-socket-fd                  |
 +-------------------------------+
 | evtchn-fd                     |
++---------------+---------------+
+| n-dom-quota   | n-glob-quota  |
++---------------+---------------+
+| quota-val 1                   |
++-------------------------------+
+...
 +-------------------------------+
+| quota-val N                   |
++-------------------------------+
+| quota-names
+...
 ```
 
 
@@ -128,6 +144,22 @@ xenstored state that needs to be restored.
 |                |                                              |
 | `evtchn-fd`    | The file descriptor used to communicate with |
 |                | the event channel driver                     |
+|                |                                              |
+| `n-dom-quota`  | Number of quota values which apply per       |
+|                | domain. Valid only for version 2 and later.  |
+|                |                                              |
+| `n-glob-quota` | Number of quota values which apply globally  |
+|                | only. Valid only for version 2 and later.    |
+|                |                                              |
+| `quota-val`    | Quota values, first the ones applying per    |
+|                | domain, then the ones applying globally. A   |
+|                | value of 0 has the semantics of "unlimited". |
+|                | Valid only for version 2 and later.          |
+|                |                                              |
+| `quota-names`  | 0 delimited strings of the quota names in    |
+|                | the same sequence as the `quota-val` values. |
+|                | Valid only for version 2 and later.          |
+
 
 xenstored will resume in the original process context. Hence `rw-socket-fd`
 simply specifies the file descriptor of the socket. Sockets are not always
@@ -145,7 +177,7 @@ the domain being migrated.
 ```
     0       1       2       3       4       5       6       7    octet
 +-------+-------+-------+-------+-------+-------+-------+-------+
-| conn-id                       | conn-type     |               |
+| conn-id                       | conn-type     | n-quota       |
 +-------------------------------+---------------+---------------+
 | conn-spec
 ...
@@ -154,6 +186,17 @@ the domain being migrated.
 +---------------+---------------+-------------------------------+
 | data
 ...
++-------------------------------+
+| features                      |
++-------------------------------+
+| quota-val 1                   |
++-------------------------------+
+...
++-------------------------------+
+| quota-val N                   |
++-------------------------------+
+| quota-names
+...
 ```
 
 
@@ -167,6 +210,10 @@ the domain being migrated.
 |                | 0x0001: socket                               |
 |                | 0x0002 - 0xFFFF: reserved for future use     |
 |                |                                              |
+| `n-quota`      | Number of quota values.                      |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |
+|                |                                              |
 | `conn-spec`    | See below                                    |
 |                |                                              |
 | `in-data-len`  | The length (in octets) of any data read      |
@@ -182,6 +229,22 @@ the domain being migrated.
 | `data`         | Pending data: first in-data-len octets of    |
 |                | read data, then out-data-len octets of       |
 |                | written data (any of both may be empty)      |
+|                |                                              |
+| `features`     | Value of the feature field visible by the    |
+|                | guest at offset 2064 of the ring page.       |
+|                | Aligned to the next 4 octet boundary.        |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |
+|                |                                              |
+| `quota-val`    | Quota values, a value of 0 has the semantics |
+|                | "unlimited".                                 |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |
+|                |                                              |
+| `quota-names`  | 0 delimited strings of the quota names in    |
+|                | the same sequence as the `quota-val` values. |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |
 
 In case of live update the connection record for the connection via which
 the live update command was issued will contain the response for the live
@@ -247,7 +310,7 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
 
 ```
     0       1       2       3    octet
-+-------+-------+-------+-------+
++---------------+---------------+
 | conn-id                       |
 +---------------+---------------+
 | wpath-len     | token-len     |
@@ -256,6 +319,9 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
 ...
 | token
 ...
++---------------+---------------+
+| depth         |               |
++---------------+---------------+
 ```
 
 
@@ -275,6 +341,13 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
 |             |                                                 |
 | `token`     | The watch identifier token, as specified in the |
 |             | `WATCH` operation                               |
+|             |                                                 |
+| `depth`     | The number of directory levels below the        |
+|             | watched path to consider for a match. This      |
+|             | field is aligned to the next 4 octet boundary.  |
+|             | A value of 0xffff is used for unlimited depth.  |
+|             | This field is valid only for version 2 and      |
+|             | higher.                                         |
 
 \pagebreak
 
@@ -406,6 +479,12 @@ A node permission specifier has the following format:
 Note that perm1 defines the domain owning the node. See [4] for more
 explanation of node permissions.
 
+\pagebreak
+
+### DOMAIN_DATA
+
+
+
 * * *
 
 [1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md
-- 
2.35.3
Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Julien Grall 3 years, 6 months ago
Hi,

On 03/08/2022 12:59, Juergen Gross wrote:
> Extend the definition of the Xenstore migration stream to cover new
> features:
> 
> - per domain features
> - extended watches (watch depth)
> - per domain quota
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V3:
> - new patch
> ---
>   docs/designs/xenstore-migration.md | 85 ++++++++++++++++++++++++++++--
>   1 file changed, 82 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/designs/xenstore-migration.md b/docs/designs/xenstore-migration.md
> index efa526f420..b2b1d3d5c7 100644
> --- a/docs/designs/xenstore-migration.md
> +++ b/docs/designs/xenstore-migration.md
> @@ -43,7 +43,13 @@ the setting of the endianness bit.
>   |-----------|---------------------------------------------------|
>   | `ident`   | 0x78656e73746f7265 ('xenstore' in ASCII)          |
>   |           |                                                   |
> -| `version` | 0x00000001 (the version of the specification)     |
> +| `version` | The version of the specification, defined values: |
> +|           | 0x00000001: all fields without any explicitly     |
> +|           |             mentioned version dependency are      |
> +|           |             valid.                                |
> +|           | 0x00000002: all fields valid for version 1 plus   |
> +|           |             fields explicitly stated to be        |
> +|           |             supported in version 2 are valid.     |

I am a bit concerned with the bump of the versions. It means, it will be 
necessary for Xenstored to know whether the new Xenstored speaks v1 or 
v2. This is less an issue when Live-Migration (although there is a fleet 
management problem) but it will be one for Live-Update if we need to 
rollback.

So I am wondering if we can avoid to bump the version and use other 
means to detect the difference.

>   |           |                                                   |
>   | `flags`   | 0 (LSB): Endianness: 0 = little, 1 = big          |
>   |           |                                                   |
> @@ -117,7 +123,17 @@ xenstored state that needs to be restored.
>   | rw-socket-fd                  |
>   +-------------------------------+
>   | evtchn-fd                     |
> ++---------------+---------------+
> +| n-dom-quota   | n-glob-quota  |
> ++---------------+---------------+
> +| quota-val 1                   |
> ++-------------------------------+
> +...
>   +-------------------------------+
> +| quota-val N                   |
> ++-------------------------------+
> +| quota-names
> +...
>   ```
>   
>   
> @@ -128,6 +144,22 @@ xenstored state that needs to be restored.
>   |                |                                              |
>   | `evtchn-fd`    | The file descriptor used to communicate with |
>   |                | the event channel driver                     |
> +|                |                                              |
> +| `n-dom-quota`  | Number of quota values which apply per       |
> +|                | domain. Valid only for version 2 and later.  |
> +|                |                                              |

I think we can avoid extending the structure here by creating a separate 
record for the quota.

With that, the new Xenstored don't need specific code to deal with 
rollback because, as AFAICT, unimplemented records are ignored by 
existing Xenstored.

> +| `n-glob-quota` | Number of quota values which apply globally  |
> +|                | only. Valid only for version 2 and later.    |
> +|                |                                              |
> +| `quota-val`    | Quota values, first the ones applying per    |
> +|                | domain, then the ones applying globally. A   |
> +|                | value of 0 has the semantics of "unlimited". |
> +|                | Valid only for version 2 and later.          |
> +|                |                                              |
> +| `quota-names`  | 0 delimited strings of the quota names in    |
> +|                | the same sequence as the `quota-val` values. | > +|                | Valid only for version 2 and later.          |

 From my understanding, both version of Xenstored needs to agree on the 
quota names. So it means the name have to be defined as part of the 
spec. At which point, I think it would be better to use ID.

Also, can you clarify what would happen if the stream contains a quota 
not supported by the new Xenstored?

> +
>   
>   xenstored will resume in the original process context. Hence `rw-socket-fd`
>   simply specifies the file descriptor of the socket. Sockets are not always
> @@ -145,7 +177,7 @@ the domain being migrated.
>   ```
>       0       1       2       3       4       5       6       7    octet
>   +-------+-------+-------+-------+-------+-------+-------+-------+
> -| conn-id                       | conn-type     |               |
> +| conn-id                       | conn-type     | n-quota       |
>   +-------------------------------+---------------+---------------+
>   | conn-spec
>   ...
> @@ -154,6 +186,17 @@ the domain being migrated.
>   +---------------+---------------+-------------------------------+
>   | data
>   ...
> ++-------------------------------+
> +| features                      |
> ++-------------------------------+
> +| quota-val 1                   |
> ++-------------------------------+
> +...
> ++-------------------------------+
> +| quota-val N                   |
> ++-------------------------------+
> +| quota-names
> +...
>   ```
>   
>   
> @@ -167,6 +210,10 @@ the domain being migrated.
>   |                | 0x0001: socket                               |
>   |                | 0x0002 - 0xFFFF: reserved for future use     |
>   |                |                                              |
> +| `n-quota`      | Number of quota values.                      |
> +|                | Only for `conn-type` 0 (shared ring).        |
> +|                | Only valid for version 2 and later.          |
> +|                |                                              |
>   | `conn-spec`    | See below                                    |
>   |                |                                              |
>   | `in-data-len`  | The length (in octets) of any data read      |
> @@ -182,6 +229,22 @@ the domain being migrated.
>   | `data`         | Pending data: first in-data-len octets of    |
>   |                | read data, then out-data-len octets of       |
>   |                | written data (any of both may be empty)      |
> +|                |                                              |
> +| `features`     | Value of the feature field visible by the    |
> +|                | guest at offset 2064 of the ring page.       |
> +|                | Aligned to the next 4 octet boundary.        |
> +|                | Only for `conn-type` 0 (shared ring).        |

For the purpose of the stream, I would consider to make it available for 
the socket connection. This could potentially be used in the future to 
allow each application to have a different behavior when socket is used.

I can't make my mind yet if we can avoid bumping the version for this 
field. What would happen if we need to rollback?

> +|                | Only valid for version 2 and later.          | > +|                |                                              |
> +| `quota-val`    | Quota values, a value of 0 has the semantics |
> +|                | "unlimited".                                 |
> +|                | Only for `conn-type` 0 (shared ring).        |
> +|                | Only valid for version 2 and later.          |

I would suggest to be very obvious and say the value will override the 
value (if any)

> +|                |                                              |
> +| `quota-names`  | 0 delimited strings of the quota names in    |
> +|                | the same sequence as the `quota-val` values. |
> +|                | Only for `conn-type` 0 (shared ring).        |
> +|                | Only valid for version 2 and later.          |

As for the "global" quotas, I would move the quotas in a separate 
record. In this case, this would also be useful to avoid having may 
dynamic length field within the same record.

>   
>   In case of live update the connection record for the connection via which
>   the live update command was issued will contain the response for the live
> @@ -247,7 +310,7 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
>   
>   ```
>       0       1       2       3    octet
> -+-------+-------+-------+-------+
> ++---------------+---------------+
>   | conn-id                       |
>   +---------------+---------------+
>   | wpath-len     | token-len     |
> @@ -256,6 +319,9 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
>   ...
>   | token
>   ...
> ++---------------+---------------+
> +| depth         |               |
> ++---------------+---------------+
>   ```
>   
>   
> @@ -275,6 +341,13 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
>   |             |                                                 |
>   | `token`     | The watch identifier token, as specified in the |
>   |             | `WATCH` operation                               |
> +|             |                                                 |
> +| `depth`     | The number of directory levels below the        |
> +|             | watched path to consider for a match. This      |
> +|             | field is aligned to the next 4 octet boundary.  |
> +|             | A value of 0xffff is used for unlimited depth.  |
> +|             | This field is valid only for version 2 and      |
> +|             | higher.                                         |

If we are going to bump the stream version, then I think we should move 
the field before token/path.

>   
>   \pagebreak
>   
> @@ -406,6 +479,12 @@ A node permission specifier has the following format:
>   Note that perm1 defines the domain owning the node. See [4] for more
>   explanation of node permissions.
>   
> +\pagebreak
> +
> +### DOMAIN_DATA
> +
> +
> +

What this section is for?

Cheers,

>   * * *
>   
>   [1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md

-- 
Julien Grall
Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Juergen Gross 3 years, 6 months ago
On 04.08.22 21:28, Julien Grall wrote:
> Hi,
> 
> On 03/08/2022 12:59, Juergen Gross wrote:
>> Extend the definition of the Xenstore migration stream to cover new
>> features:
>>
>> - per domain features
>> - extended watches (watch depth)
>> - per domain quota
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V3:
>> - new patch
>> ---
>>   docs/designs/xenstore-migration.md | 85 ++++++++++++++++++++++++++++--
>>   1 file changed, 82 insertions(+), 3 deletions(-)
>>
>> diff --git a/docs/designs/xenstore-migration.md 
>> b/docs/designs/xenstore-migration.md
>> index efa526f420..b2b1d3d5c7 100644
>> --- a/docs/designs/xenstore-migration.md
>> +++ b/docs/designs/xenstore-migration.md
>> @@ -43,7 +43,13 @@ the setting of the endianness bit.
>>   |-----------|---------------------------------------------------|
>>   | `ident`   | 0x78656e73746f7265 ('xenstore' in ASCII)          |
>>   |           |                                                   |
>> -| `version` | 0x00000001 (the version of the specification)     |
>> +| `version` | The version of the specification, defined values: |
>> +|           | 0x00000001: all fields without any explicitly     |
>> +|           |             mentioned version dependency are      |
>> +|           |             valid.                                |
>> +|           | 0x00000002: all fields valid for version 1 plus   |
>> +|           |             fields explicitly stated to be        |
>> +|           |             supported in version 2 are valid.     |
> 
> I am a bit concerned with the bump of the versions. It means, it will be 
> necessary for Xenstored to know whether the new Xenstored speaks v1 or v2. This 
> is less an issue when Live-Migration (although there is a fleet management 
> problem) but it will be one for Live-Update if we need to rollback.
> 
> So I am wondering if we can avoid to bump the version and use other means to 
> detect the difference.

In the end this is exactly what the version was meant to be used for.

I think it would make much more sense to think about the way to handle a
bump of the version in a compatible way.

My idea was to add a xenstored command line parameter for limiting the
migration stream version to be used to a specified version, causing new
features probably to not be available, though.

I don't see how e.g. a rollback would be doable in case a domain already
started to use a new feature like the third parameter when setting a watch.
Even if we'd drop the depth information when rolling back a watch set
afterwards with an additional depth added would be rejected by the older
xenstored, which would be unexpected failure for the guest.

It might make sense to try to use the V1 stream when doing a live update,
e.g. covering the case when the SET_FEATURE command was used for each
active guest to limit the features to V1 compatible ones. A force parameter
might be used to use the V1 stream even if guests are using V2 features,
risking breakage of those guests.

> 
>>   |           |                                                   |
>>   | `flags`   | 0 (LSB): Endianness: 0 = little, 1 = big          |
>>   |           |                                                   |
>> @@ -117,7 +123,17 @@ xenstored state that needs to be restored.
>>   | rw-socket-fd                  |
>>   +-------------------------------+
>>   | evtchn-fd                     |
>> ++---------------+---------------+
>> +| n-dom-quota   | n-glob-quota  |
>> ++---------------+---------------+
>> +| quota-val 1                   |
>> ++-------------------------------+
>> +...
>>   +-------------------------------+
>> +| quota-val N                   |
>> ++-------------------------------+
>> +| quota-names
>> +...
>>   ```
>> @@ -128,6 +144,22 @@ xenstored state that needs to be restored.
>>   |                |                                              |
>>   | `evtchn-fd`    | The file descriptor used to communicate with |
>>   |                | the event channel driver                     |
>> +|                |                                              |
>> +| `n-dom-quota`  | Number of quota values which apply per       |
>> +|                | domain. Valid only for version 2 and later.  |
>> +|                |                                              |
> 
> I think we can avoid extending the structure here by creating a separate record 
> for the quota.
> 
> With that, the new Xenstored don't need specific code to deal with rollback 
> because, as AFAICT, unimplemented records are ignored by existing Xenstored.

For quota this would be possible regarding the functionality, but there are
use cases which might suffer in case quota are not being accepted by the
new instance (e.g. driver domains needing higher quota).

OTOH this is no guest visible interface, so I'm on the edge to add this
data to V1. This would even be possible using the global record of V1, as
the length information of the record allows to add new fields without
having to bump the version.

> 
>> +| `n-glob-quota` | Number of quota values which apply globally  |
>> +|                | only. Valid only for version 2 and later.    |
>> +|                |                                              |
>> +| `quota-val`    | Quota values, first the ones applying per    |
>> +|                | domain, then the ones applying globally. A   |
>> +|                | value of 0 has the semantics of "unlimited". |
>> +|                | Valid only for version 2 and later.          |
>> +|                |                                              |
>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>> +|                | the same sequence as the `quota-val` values. | > 
>> +|                | Valid only for version 2 and later.          |
> 
>  From my understanding, both version of Xenstored needs to agree on the quota 
> names. So it means the name have to be defined as part of the spec. At which 
> point, I think it would be better to use ID.

I don't think so. For one the minimal set of quota names has been defined
already in patch 3. And even with using an ID you'd have the same problem
again, but without having the possibility to add variant specific quota
(remember that there already has been a statement that doing a live update
from C to OCAML or vice versa would probably break users due to some
deviations in behavior).

> Also, can you clarify what would happen if the stream contains a quota not 
> supported by the new Xenstored?

Yes, I'll add a sentence that those should be ignored.

> 
>> +
>>   xenstored will resume in the original process context. Hence `rw-socket-fd`
>>   simply specifies the file descriptor of the socket. Sockets are not always
>> @@ -145,7 +177,7 @@ the domain being migrated.
>>   ```
>>       0       1       2       3       4       5       6       7    octet
>>   +-------+-------+-------+-------+-------+-------+-------+-------+
>> -| conn-id                       | conn-type     |               |
>> +| conn-id                       | conn-type     | n-quota       |
>>   +-------------------------------+---------------+---------------+
>>   | conn-spec
>>   ...
>> @@ -154,6 +186,17 @@ the domain being migrated.
>>   +---------------+---------------+-------------------------------+
>>   | data
>>   ...
>> ++-------------------------------+
>> +| features                      |
>> ++-------------------------------+
>> +| quota-val 1                   |
>> ++-------------------------------+
>> +...
>> ++-------------------------------+
>> +| quota-val N                   |
>> ++-------------------------------+
>> +| quota-names
>> +...
>>   ```
>> @@ -167,6 +210,10 @@ the domain being migrated.
>>   |                | 0x0001: socket                               |
>>   |                | 0x0002 - 0xFFFF: reserved for future use     |
>>   |                |                                              |
>> +| `n-quota`      | Number of quota values.                      |
>> +|                | Only for `conn-type` 0 (shared ring).        |
>> +|                | Only valid for version 2 and later.          |
>> +|                |                                              |
>>   | `conn-spec`    | See below                                    |
>>   |                |                                              |
>>   | `in-data-len`  | The length (in octets) of any data read      |
>> @@ -182,6 +229,22 @@ the domain being migrated.
>>   | `data`         | Pending data: first in-data-len octets of    |
>>   |                | read data, then out-data-len octets of       |
>>   |                | written data (any of both may be empty)      |
>> +|                |                                              |
>> +| `features`     | Value of the feature field visible by the    |
>> +|                | guest at offset 2064 of the ring page.       |
>> +|                | Aligned to the next 4 octet boundary.        |
>> +|                | Only for `conn-type` 0 (shared ring).        |
> 
> For the purpose of the stream, I would consider to make it available for the 
> socket connection. This could potentially be used in the future to allow each 
> application to have a different behavior when socket is used.

This would break the use of xenstore-stubdom for such a setup.

> I can't make my mind yet if we can avoid bumping the version for this field. 
> What would happen if we need to rollback?

I think an active usage of the new features and a rollback are mutually
exclusive. See above.

> 
>> +|                | Only valid for version 2 and later.          | > 
>> +|                |                                              |
>> +| `quota-val`    | Quota values, a value of 0 has the semantics |
>> +|                | "unlimited".                                 |
>> +|                | Only for `conn-type` 0 (shared ring).        |
>> +|                | Only valid for version 2 and later.          |
> 
> I would suggest to be very obvious and say the value will override the value (if 
> any)

Okay.

> 
>> +|                |                                              |
>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>> +|                | the same sequence as the `quota-val` values. |
>> +|                | Only for `conn-type` 0 (shared ring).        |
>> +|                | Only valid for version 2 and later.          |
> 
> As for the "global" quotas, I would move the quotas in a separate record. In 
> this case, this would also be useful to avoid having may dynamic length field 
> within the same record.

I like having the data together more.

But I'm not really feeling strong about that.

> 
>>   In case of live update the connection record for the connection via which
>>   the live update command was issued will contain the response for the live
>> @@ -247,7 +310,7 @@ by a connection for which there is `CONNECTION_DATA` 
>> record previously present.
>>   ```
>>       0       1       2       3    octet
>> -+-------+-------+-------+-------+
>> ++---------------+---------------+
>>   | conn-id                       |
>>   +---------------+---------------+
>>   | wpath-len     | token-len     |
>> @@ -256,6 +319,9 @@ by a connection for which there is `CONNECTION_DATA` 
>> record previously present.
>>   ...
>>   | token
>>   ...
>> ++---------------+---------------+
>> +| depth         |               |
>> ++---------------+---------------+
>>   ```
>> @@ -275,6 +341,13 @@ by a connection for which there is `CONNECTION_DATA` 
>> record previously present.
>>   |             |                                                 |
>>   | `token`     | The watch identifier token, as specified in the |
>>   |             | `WATCH` operation                               |
>> +|             |                                                 |
>> +| `depth`     | The number of directory levels below the        |
>> +|             | watched path to consider for a match. This      |
>> +|             | field is aligned to the next 4 octet boundary.  |
>> +|             | A value of 0xffff is used for unlimited depth.  |
>> +|             | This field is valid only for version 2 and      |
>> +|             | higher.                                         |
> 
> If we are going to bump the stream version, then I think we should move the 
> field before token/path.

I thought about that, but liked it better to be able to keep a common struct
layout for the record with the V2 fields being at the end.

Main reason is the ability to avoid duplication of code for being able to
handle both versions.

> 
>>   \pagebreak
>> @@ -406,6 +479,12 @@ A node permission specifier has the following format:
>>   Note that perm1 defines the domain owning the node. See [4] for more
>>   explanation of node permissions.
>> +\pagebreak
>> +
>> +### DOMAIN_DATA
>> +
>> +
>> +
> 
> What this section is for?

Oh that is a stale result of a previous variant considering an own record
type for quota and feature information. :-)


Juergen
Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Julien Grall 3 years, 6 months ago
Hi Juergen,

On 08/08/2022 07:33, Juergen Gross wrote:
> On 04.08.22 21:28, Julien Grall wrote:
>> On 03/08/2022 12:59, Juergen Gross wrote:
>>> Extend the definition of the Xenstore migration stream to cover new
>>> features:
>>>
>>> - per domain features
>>> - extended watches (watch depth)
>>> - per domain quota
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ---
>>> V3:
>>> - new patch
>>> ---
>>>   docs/designs/xenstore-migration.md | 85 ++++++++++++++++++++++++++++--
>>>   1 file changed, 82 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/docs/designs/xenstore-migration.md 
>>> b/docs/designs/xenstore-migration.md
>>> index efa526f420..b2b1d3d5c7 100644
>>> --- a/docs/designs/xenstore-migration.md
>>> +++ b/docs/designs/xenstore-migration.md
>>> @@ -43,7 +43,13 @@ the setting of the endianness bit.
>>>   |-----------|---------------------------------------------------|
>>>   | `ident`   | 0x78656e73746f7265 ('xenstore' in ASCII)          |
>>>   |           |                                                   |
>>> -| `version` | 0x00000001 (the version of the specification)     |
>>> +| `version` | The version of the specification, defined values: |
>>> +|           | 0x00000001: all fields without any explicitly     |
>>> +|           |             mentioned version dependency are      |
>>> +|           |             valid.                                |
>>> +|           | 0x00000002: all fields valid for version 1 plus   |
>>> +|           |             fields explicitly stated to be        |
>>> +|           |             supported in version 2 are valid.     |
>>
>> I am a bit concerned with the bump of the versions. It means, it will 
>> be necessary for Xenstored to know whether the new Xenstored speaks v1 
>> or v2. This is less an issue when Live-Migration (although there is a 
>> fleet management problem) but it will be one for Live-Update if we 
>> need to rollback.
>>
>> So I am wondering if we can avoid to bump the version and use other 
>> means to detect the difference.
> 
> In the end this is exactly what the version was meant to be used for.
> 
> I think it would make much more sense to think about the way to handle a
> bump of the version in a compatible way.
> 
> My idea was to add a xenstored command line parameter for limiting the
> migration stream version to be used to a specified version, causing new
> features probably to not be available, though.

I think this is fine. Someone that cares about rollback will also likely 
care about fleet diversity. So they will want to avoid enabling a 
feature until they know it can work everywhere.

> 
> I don't see how e.g. a rollback would be doable in case a domain already
> started to use a new feature like the third parameter when setting a watch.
> Even if we'd drop the depth information when rolling back a watch set
> afterwards with an additional depth added would be rejected by the older
> xenstored, which would be unexpected failure for the guest.

See above.

> 
> It might make sense to try to use the V1 stream when doing a live update,
> e.g. covering the case when the SET_FEATURE command was used for each
> active guest to limit the features to V1 compatible ones. A force parameter
> might be used to use the V1 stream even if guests are using V2 features,
> risking breakage of those guests.

I don't have a strong opinion on this yet. I might have some when seen 
the code :).

[...]

> This would even be possible using the global record of V1, as
> the length information of the record allows to add new fields without
> having to bump the version.

I was actually thinking about this when writing the e-mail last week. 
There are no dynamic length array in the global records so far, so using 
the length information would be ok. I am more concerned about the others 
because we are mixing fixed and dynamic length.

This means it is more difficult to read the code and the layout.

> 
>>
>>> +| `n-glob-quota` | Number of quota values which apply globally  |
>>> +|                | only. Valid only for version 2 and later.    |
>>> +|                |                                              |
>>> +| `quota-val`    | Quota values, first the ones applying per    |
>>> +|                | domain, then the ones applying globally. A   |
>>> +|                | value of 0 has the semantics of "unlimited". |
>>> +|                | Valid only for version 2 and later.          |
>>> +|                |                                              |
>>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>>> +|                | the same sequence as the `quota-val` values. | > 
>>> +|                | Valid only for version 2 and later.          |
>>
>>  From my understanding, both version of Xenstored needs to agree on 
>> the quota names. So it means the name have to be defined as part of 
>> the spec. At which point, I think it would be better to use ID.
> 
> I don't think so. For one the minimal set of quota names has been defined
> already in patch 3.

Someone reading the migration stream will not necessarily read the 
Xenstore protocol. So I think we should either make them explicit in the 
documentation or have a link to the other document.

> And even with using an ID you'd have the same problem
> again, but without having the possibility to add variant specific quota

Fair enough.

> (remember that there already has been a statement that doing a live update
> from C to OCAML or vice versa would probably break users due to some
> deviations in behavior)
I can't find such statement in public documentation. Do you have a link?

That said, a guest doesn't have a (easy?) way to know how Xenstored is 
implemented. So it is quite concerning to hear some of them may rely on 
behaviors. How did that happen?

> 
>> Also, can you clarify what would happen if the stream contains a quota 
>> not supported by the new Xenstored?
> 
> Yes, I'll add a sentence that those should be ignored.
> 
>>
>>> +
>>>   xenstored will resume in the original process context. Hence 
>>> `rw-socket-fd`
>>>   simply specifies the file descriptor of the socket. Sockets are not 
>>> always
>>> @@ -145,7 +177,7 @@ the domain being migrated.
>>>   ```
>>>       0       1       2       3       4       5       6       7    octet
>>>   +-------+-------+-------+-------+-------+-------+-------+-------+
>>> -| conn-id                       | conn-type     |               |
>>> +| conn-id                       | conn-type     | n-quota       |
>>>   +-------------------------------+---------------+---------------+
>>>   | conn-spec
>>>   ...
>>> @@ -154,6 +186,17 @@ the domain being migrated.
>>>   +---------------+---------------+-------------------------------+
>>>   | data
>>>   ...
>>> ++-------------------------------+
>>> +| features                      |
>>> ++-------------------------------+
>>> +| quota-val 1                   |
>>> ++-------------------------------+
>>> +...
>>> ++-------------------------------+
>>> +| quota-val N                   |
>>> ++-------------------------------+
>>> +| quota-names
>>> +...
>>>   ```
>>> @@ -167,6 +210,10 @@ the domain being migrated.
>>>   |                | 0x0001: socket                               |
>>>   |                | 0x0002 - 0xFFFF: reserved for future use     |
>>>   |                |                                              |
>>> +| `n-quota`      | Number of quota values.                      |
>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>> +|                | Only valid for version 2 and later.          |
>>> +|                |                                              |
>>>   | `conn-spec`    | See below                                    |
>>>   |                |                                              |
>>>   | `in-data-len`  | The length (in octets) of any data read      |
>>> @@ -182,6 +229,22 @@ the domain being migrated.
>>>   | `data`         | Pending data: first in-data-len octets of    |
>>>   |                | read data, then out-data-len octets of       |
>>>   |                | written data (any of both may be empty)      |
>>> +|                |                                              |
>>> +| `features`     | Value of the feature field visible by the    |
>>> +|                | guest at offset 2064 of the ring page.       |
>>> +|                | Aligned to the next 4 octet boundary.        |
>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>
>> For the purpose of the stream, I would consider to make it available 
>> for the socket connection. This could potentially be used in the 
>> future to allow each application to have a different behavior when 
>> socket is used.
> 
> This would break the use of xenstore-stubdom for such a setup.

I am not sure why it would break the use of xenstore-stubdom. An 
application will already need to cope with the case Xenstored doesn't 
support a feature.

At which point, it would be easy to say "I don't want this feature" when 
using a socket.

> 
>> I can't make my mind yet if we can avoid bumping the version for this 
>> field. What would happen if we need to rollback?
> 
> I think an active usage of the new features and a rollback are mutually
> exclusive. See above.
>>> +|                |                                              |
>>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>>> +|                | the same sequence as the `quota-val` values. |
>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>> +|                | Only valid for version 2 and later.          |
>>
>> As for the "global" quotas, I would move the quotas in a separate 
>> record. In this case, this would also be useful to avoid having may 
>> dynamic length field within the same record.
> 
> I like having the data together more.

Which is fine so long the code doesn't become too horrible to 
read/maintain. I think having dynamic length array in the middle of the 
record makes it trickier.

This will only become worse as we introduce new fields in newer 
revision. So at which point would you say the record has grown too much?

To me, this is already the point and we have plenty of record ID to 
handle that.

>>
>>>   In case of live update the connection record for the connection via 
>>> which
>>>   the live update command was issued will contain the response for 
>>> the live
>>> @@ -247,7 +310,7 @@ by a connection for which there is 
>>> `CONNECTION_DATA` record previously present.
>>>   ```
>>>       0       1       2       3    octet
>>> -+-------+-------+-------+-------+
>>> ++---------------+---------------+
>>>   | conn-id                       |
>>>   +---------------+---------------+
>>>   | wpath-len     | token-len     |
>>> @@ -256,6 +319,9 @@ by a connection for which there is 
>>> `CONNECTION_DATA` record previously present.
>>>   ...
>>>   | token
>>>   ...
>>> ++---------------+---------------+
>>> +| depth         |               |
>>> ++---------------+---------------+
>>>   ```
>>> @@ -275,6 +341,13 @@ by a connection for which there is 
>>> `CONNECTION_DATA` record previously present.
>>>   |             |                                                 |
>>>   | `token`     | The watch identifier token, as specified in the |
>>>   |             | `WATCH` operation                               |
>>> +|             |                                                 |
>>> +| `depth`     | The number of directory levels below the        |
>>> +|             | watched path to consider for a match. This      |
>>> +|             | field is aligned to the next 4 octet boundary.  |
>>> +|             | A value of 0xffff is used for unlimited depth.  |
>>> +|             | This field is valid only for version 2 and      |
>>> +|             | higher.                                         |
>>
>> If we are going to bump the stream version, then I think we should 
>> move the field before token/path.
> 
> I thought about that, but liked it better to be able to keep a common 
> struct
> layout for the record with the V2 fields being at the end.
> 
> Main reason is the ability to avoid duplication of code for being able to
> handle both versions.

The cons is you can't easily describe the record in "struct ...". As I 
wrote above, I think have dynamic length array in the middle of a record 
is wrong.

I have looked at the code, I don't think there will be enough code 
duplication to warrant adding fixed field at the end of the record.

Cheers,

-- 
Julien Grall

Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Juergen Gross 3 years, 6 months ago
On 08.08.22 13:00, Julien Grall wrote:
> Hi Juergen,
> 
> On 08/08/2022 07:33, Juergen Gross wrote:
>> On 04.08.22 21:28, Julien Grall wrote:
>>> On 03/08/2022 12:59, Juergen Gross wrote:
>>>> Extend the definition of the Xenstore migration stream to cover new
>>>> features:
>>>>
>>>> - per domain features
>>>> - extended watches (watch depth)
>>>> - per domain quota
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>> V3:
>>>> - new patch
>>>> ---
>>>>   docs/designs/xenstore-migration.md | 85 ++++++++++++++++++++++++++++--
>>>>   1 file changed, 82 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/docs/designs/xenstore-migration.md 
>>>> b/docs/designs/xenstore-migration.md
>>>> index efa526f420..b2b1d3d5c7 100644
>>>> --- a/docs/designs/xenstore-migration.md
>>>> +++ b/docs/designs/xenstore-migration.md
>>>> @@ -43,7 +43,13 @@ the setting of the endianness bit.
>>>>   |-----------|---------------------------------------------------|
>>>>   | `ident`   | 0x78656e73746f7265 ('xenstore' in ASCII)          |
>>>>   |           |                                                   |
>>>> -| `version` | 0x00000001 (the version of the specification)     |
>>>> +| `version` | The version of the specification, defined values: |
>>>> +|           | 0x00000001: all fields without any explicitly     |
>>>> +|           |             mentioned version dependency are      |
>>>> +|           |             valid.                                |
>>>> +|           | 0x00000002: all fields valid for version 1 plus   |
>>>> +|           |             fields explicitly stated to be        |
>>>> +|           |             supported in version 2 are valid.     |
>>>
>>> I am a bit concerned with the bump of the versions. It means, it will be 
>>> necessary for Xenstored to know whether the new Xenstored speaks v1 or v2. 
>>> This is less an issue when Live-Migration (although there is a fleet 
>>> management problem) but it will be one for Live-Update if we need to rollback.
>>>
>>> So I am wondering if we can avoid to bump the version and use other means to 
>>> detect the difference.
>>
>> In the end this is exactly what the version was meant to be used for.
>>
>> I think it would make much more sense to think about the way to handle a
>> bump of the version in a compatible way.
>>
>> My idea was to add a xenstored command line parameter for limiting the
>> migration stream version to be used to a specified version, causing new
>> features probably to not be available, though.
> 
> I think this is fine. Someone that cares about rollback will also likely care 
> about fleet diversity. So they will want to avoid enabling a feature until they 
> know it can work everywhere.

That was the idea, yes.

> 
>>
>> I don't see how e.g. a rollback would be doable in case a domain already
>> started to use a new feature like the third parameter when setting a watch.
>> Even if we'd drop the depth information when rolling back a watch set
>> afterwards with an additional depth added would be rejected by the older
>> xenstored, which would be unexpected failure for the guest.
> 
> See above.
> 
>>
>> It might make sense to try to use the V1 stream when doing a live update,
>> e.g. covering the case when the SET_FEATURE command was used for each
>> active guest to limit the features to V1 compatible ones. A force parameter
>> might be used to use the V1 stream even if guests are using V2 features,
>> risking breakage of those guests.
> 
> I don't have a strong opinion on this yet. I might have some when seen the code :).
> 
> [...]
> 
>> This would even be possible using the global record of V1, as
>> the length information of the record allows to add new fields without
>> having to bump the version.
> 
> I was actually thinking about this when writing the e-mail last week. There are 
> no dynamic length array in the global records so far, so using the length 
> information would be ok. I am more concerned about the others because we are 
> mixing fixed and dynamic length.
> 
> This means it is more difficult to read the code and the layout.
> 
>>
>>>
>>>> +| `n-glob-quota` | Number of quota values which apply globally  |
>>>> +|                | only. Valid only for version 2 and later.    |
>>>> +|                |                                              |
>>>> +| `quota-val`    | Quota values, first the ones applying per    |
>>>> +|                | domain, then the ones applying globally. A   |
>>>> +|                | value of 0 has the semantics of "unlimited". |
>>>> +|                | Valid only for version 2 and later.          |
>>>> +|                |                                              |
>>>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>>>> +|                | the same sequence as the `quota-val` values. | > 
>>>> +|                | Valid only for version 2 and later.          |
>>>
>>>  From my understanding, both version of Xenstored needs to agree on the quota 
>>> names. So it means the name have to be defined as part of the spec. At which 
>>> point, I think it would be better to use ID.
>>
>> I don't think so. For one the minimal set of quota names has been defined
>> already in patch 3.
> 
> Someone reading the migration stream will not necessarily read the Xenstore 
> protocol. So I think we should either make them explicit in the documentation or 
> have a link to the other document.

I'll add a link to the other doc.

> 
>> And even with using an ID you'd have the same problem
>> again, but without having the possibility to add variant specific quota
> 
> Fair enough.
> 
>> (remember that there already has been a statement that doing a live update
>> from C to OCAML or vice versa would probably break users due to some
>> deviations in behavior)
> I can't find such statement in public documentation. Do you have a link?

No, not really. IIRC this was a remark of Andrew back when we introduced
LU of Xenstore. It might even have been at the time the discussion was
only on the security ML.

Note that "users" doesn't need to imply guests, it might be related to
dom0 side users, like e.g. XAPI.

> That said, a guest doesn't have a (easy?) way to know how Xenstored is 
> implemented. So it is quite concerning to hear some of them may rely on 
> behaviors. How did that happen?

Lack of communication between OCAML and C Xenstore maintainers back then?

> 
>>
>>> Also, can you clarify what would happen if the stream contains a quota not 
>>> supported by the new Xenstored?
>>
>> Yes, I'll add a sentence that those should be ignored.
>>
>>>
>>>> +
>>>>   xenstored will resume in the original process context. Hence `rw-socket-fd`
>>>>   simply specifies the file descriptor of the socket. Sockets are not always
>>>> @@ -145,7 +177,7 @@ the domain being migrated.
>>>>   ```
>>>>       0       1       2       3       4       5       6       7    octet
>>>>   +-------+-------+-------+-------+-------+-------+-------+-------+
>>>> -| conn-id                       | conn-type     |               |
>>>> +| conn-id                       | conn-type     | n-quota       |
>>>>   +-------------------------------+---------------+---------------+
>>>>   | conn-spec
>>>>   ...
>>>> @@ -154,6 +186,17 @@ the domain being migrated.
>>>>   +---------------+---------------+-------------------------------+
>>>>   | data
>>>>   ...
>>>> ++-------------------------------+
>>>> +| features                      |
>>>> ++-------------------------------+
>>>> +| quota-val 1                   |
>>>> ++-------------------------------+
>>>> +...
>>>> ++-------------------------------+
>>>> +| quota-val N                   |
>>>> ++-------------------------------+
>>>> +| quota-names
>>>> +...
>>>>   ```
>>>> @@ -167,6 +210,10 @@ the domain being migrated.
>>>>   |                | 0x0001: socket                               |
>>>>   |                | 0x0002 - 0xFFFF: reserved for future use     |
>>>>   |                |                                              |
>>>> +| `n-quota`      | Number of quota values.                      |
>>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>>> +|                | Only valid for version 2 and later.          |
>>>> +|                |                                              |
>>>>   | `conn-spec`    | See below                                    |
>>>>   |                |                                              |
>>>>   | `in-data-len`  | The length (in octets) of any data read      |
>>>> @@ -182,6 +229,22 @@ the domain being migrated.
>>>>   | `data`         | Pending data: first in-data-len octets of    |
>>>>   |                | read data, then out-data-len octets of       |
>>>>   |                | written data (any of both may be empty)      |
>>>> +|                |                                              |
>>>> +| `features`     | Value of the feature field visible by the    |
>>>> +|                | guest at offset 2064 of the ring page.       |
>>>> +|                | Aligned to the next 4 octet boundary.        |
>>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>>
>>> For the purpose of the stream, I would consider to make it available for the 
>>> socket connection. This could potentially be used in the future to allow each 
>>> application to have a different behavior when socket is used.
>>
>> This would break the use of xenstore-stubdom for such a setup.
> 
> I am not sure why it would break the use of xenstore-stubdom. An application 
> will already need to cope with the case Xenstored doesn't support a feature.

Someone relying to be able to switch off a feature on a socket connection
might get into trouble trying to do the same when running with xenstore-stubdom.
Switching a feature off will either not work, or switch the feature off for all
dom0 connections (which is a single one, of course).

> At which point, it would be easy to say "I don't want this feature" when using a 
> socket.

I don't see the value of that. If you don't want a feature, just don't use it.

> 
>>
>>> I can't make my mind yet if we can avoid bumping the version for this field. 
>>> What would happen if we need to rollback?
>>
>> I think an active usage of the new features and a rollback are mutually
>> exclusive. See above.
>>>> +|                |                                              |
>>>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>>>> +|                | the same sequence as the `quota-val` values. |
>>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>>> +|                | Only valid for version 2 and later.          |
>>>
>>> As for the "global" quotas, I would move the quotas in a separate record. In 
>>> this case, this would also be useful to avoid having may dynamic length field 
>>> within the same record.
>>
>> I like having the data together more.
> 
> Which is fine so long the code doesn't become too horrible to read/maintain. I 
> think having dynamic length array in the middle of the record makes it trickier.
> 
> This will only become worse as we introduce new fields in newer revision. So at 
> which point would you say the record has grown too much?
> 
> To me, this is already the point and we have plenty of record ID to handle that.

Fair enough.

Other questions arising from that:

- Should we have different record types for global and per-domain quota?

- Should we split global quota into two record types (per-domain settable
   and global acting ones)?

- Combination of above (one record type for per-domain quota, usable for
   global default with "invalid domid", and one record type for the global
   acting ones like max. path-length)?

> 
>>>
>>>>   In case of live update the connection record for the connection via which
>>>>   the live update command was issued will contain the response for the live
>>>> @@ -247,7 +310,7 @@ by a connection for which there is `CONNECTION_DATA` 
>>>> record previously present.
>>>>   ```
>>>>       0       1       2       3    octet
>>>> -+-------+-------+-------+-------+
>>>> ++---------------+---------------+
>>>>   | conn-id                       |
>>>>   +---------------+---------------+
>>>>   | wpath-len     | token-len     |
>>>> @@ -256,6 +319,9 @@ by a connection for which there is `CONNECTION_DATA` 
>>>> record previously present.
>>>>   ...
>>>>   | token
>>>>   ...
>>>> ++---------------+---------------+
>>>> +| depth         |               |
>>>> ++---------------+---------------+
>>>>   ```
>>>> @@ -275,6 +341,13 @@ by a connection for which there is `CONNECTION_DATA` 
>>>> record previously present.
>>>>   |             |                                                 |
>>>>   | `token`     | The watch identifier token, as specified in the |
>>>>   |             | `WATCH` operation                               |
>>>> +|             |                                                 |
>>>> +| `depth`     | The number of directory levels below the        |
>>>> +|             | watched path to consider for a match. This      |
>>>> +|             | field is aligned to the next 4 octet boundary.  |
>>>> +|             | A value of 0xffff is used for unlimited depth.  |
>>>> +|             | This field is valid only for version 2 and      |
>>>> +|             | higher.                                         |
>>>
>>> If we are going to bump the stream version, then I think we should move the 
>>> field before token/path.
>>
>> I thought about that, but liked it better to be able to keep a common struct
>> layout for the record with the V2 fields being at the end.
>>
>> Main reason is the ability to avoid duplication of code for being able to
>> handle both versions.
> 
> The cons is you can't easily describe the record in "struct ...". As I wrote 
> above, I think have dynamic length array in the middle of a record is wrong.

You've got a point here.

> I have looked at the code, I don't think there will be enough code duplication 
> to warrant adding fixed field at the end of the record.

Okay, lets go with a new record type then.

Should that always be used, or only if the depth information has been
specified (IOW: is the old watch record format invalid in V2)?


Juergen
Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Julien Grall 3 years, 6 months ago
Hi Juergen,

On 08/08/2022 12:31, Juergen Gross wrote:
>>> And even with using an ID you'd have the same problem
>>> again, but without having the possibility to add variant specific quota
>>
>> Fair enough.
>>
>>> (remember that there already has been a statement that doing a live 
>>> update
>>> from C to OCAML or vice versa would probably break users due to some
>>> deviations in behavior)
>> I can't find such statement in public documentation. Do you have a link?
> 
> No, not really. IIRC this was a remark of Andrew back when we introduced
> LU of Xenstore. It might even have been at the time the discussion was
> only on the security ML.
> 
> Note that "users" doesn't need to imply guests, it might be related to
> dom0 side users, like e.g. XAPI.

I understand that "users" doesn't imply guests. Hovewer, it is still not 
quite clear to me what sort of behavior the application could rely on 
here. Is it related to the transaction?

Anyway, I think it would be good to document such behaviors because they 
don't seem to be obvious.

Cheers,

-- 
Julien Grall
Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Julien Grall 3 years, 6 months ago
Hi Juergen,

On 08/08/2022 12:31, Juergen Gross wrote:
> On 08.08.22 13:00, Julien Grall wrote:
>>> This would break the use of xenstore-stubdom for such a setup.
>>
>> I am not sure why it would break the use of xenstore-stubdom. An 
>> application will already need to cope with the case Xenstored doesn't 
>> support a feature.
> 
> Someone relying to be able to switch off a feature on a socket connection
> might get into trouble trying to do the same when running with 
> xenstore-stubdom.

This is not very different from an application that was built against an 
old Xenstored and would not be capable to talk properly with the new 
Xenstored if the feature is enabled. I understand that...

> Switching a feature off will either not work, or switch the feature off 
> for all
> dom0 connections (which is a single one, of course).

... when using xenstore-stubdom xenstored it means that the feature will 
have to be disable for all dom0 connections.

However, it seems unlikely to me that someone will switch to a 
xenstore-stubdom on a whim because there are also scalability concerns 
(one ring to rule all connections). So I think it would be fair to say 
that your application may need to be tweak if you are not using the same 
feature as the system.

> 
>> At which point, it would be easy to say "I don't want this feature" 
>> when using a socket.
> 
> I don't see the value of that. If you don't want a feature, just don't 
> use it.

This is not that simple. Your assumption is the feature will not change 
the behavior exposed to the application.

I don't think we have such feature today but I don't see what prevents 
us to do that.

>>>> I can't make my mind yet if we can avoid bumping the version for 
>>>> this field. What would happen if we need to rollback?
>>>
>>> I think an active usage of the new features and a rollback are mutually
>>> exclusive. See above.
>>>>> +|                |                                              |
>>>>> +| `quota-names`  | 0 delimited strings of the quota names in    |
>>>>> +|                | the same sequence as the `quota-val` values. |
>>>>> +|                | Only for `conn-type` 0 (shared ring).        |
>>>>> +|                | Only valid for version 2 and later.          |
>>>>
>>>> As for the "global" quotas, I would move the quotas in a separate 
>>>> record. In this case, this would also be useful to avoid having may 
>>>> dynamic length field within the same record.
>>>
>>> I like having the data together more.
>>
>> Which is fine so long the code doesn't become too horrible to 
>> read/maintain. I think having dynamic length array in the middle of 
>> the record makes it trickier.
>>
>> This will only become worse as we introduce new fields in newer 
>> revision. So at which point would you say the record has grown too much?
>>
>> To me, this is already the point and we have plenty of record ID to 
>> handle that.
> 
> Fair enough.
> 
> Other questions arising from that:
> 
> - Should we have different record types for global and per-domain quota?

Given the question below, I guess you mean per-domain quotas that are 
not the default ones. If yes, then they should be split.

> 
> - Should we split global quota into two record types (per-domain settable
>    and global acting ones)?

I don't have a strong opinion on this.

> 
> - Combination of above (one record type for per-domain quota, usable for
>    global default with "invalid domid", and one record type for the global
>    acting ones like max. path-length)?

[...]

>>> I thought about that, but liked it better to be able to keep a common 
>>> struct
>>> layout for the record with the V2 fields being at the end.
>>>
>>> Main reason is the ability to avoid duplication of code for being 
>>> able to
>>> handle both versions.
>>
>> The cons is you can't easily describe the record in "struct ...". As I 
>> wrote above, I think have dynamic length array in the middle of a 
>> record is wrong.
> 
> You've got a point here.
> 
>> I have looked at the code, I don't think there will be enough code 
>> duplication to warrant adding fixed field at the end of the record.
> 
> Okay, lets go with a new record type then.
> 
> Should that always be used, or only if the depth information has been
> specified (IOW: is the old watch record format invalid in V2)?

I don't have a strong opinion.

Cheers,

-- 
Julien Grall

Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Juergen Gross 3 years, 6 months ago
On 12.08.22 11:13, Julien Grall wrote:
> Hi Juergen,
> 
> On 08/08/2022 12:31, Juergen Gross wrote:
>> On 08.08.22 13:00, Julien Grall wrote:
>>>> This would break the use of xenstore-stubdom for such a setup.
>>>
>>> I am not sure why it would break the use of xenstore-stubdom. An application 
>>> will already need to cope with the case Xenstored doesn't support a feature.
>>
>> Someone relying to be able to switch off a feature on a socket connection
>> might get into trouble trying to do the same when running with xenstore-stubdom.
> 
> This is not very different from an application that was built against an old 
> Xenstored and would not be capable to talk properly with the new Xenstored if 
> the feature is enabled. I understand that...
> 
>> Switching a feature off will either not work, or switch the feature off for all
>> dom0 connections (which is a single one, of course).
> 
> ... when using xenstore-stubdom xenstored it means that the feature will have to 
> be disable for all dom0 connections.

Wait, I don't think we can ever add features which will change the behavior of
Xenstore when those new features aren't being used actively. The new features
should always be on top of the existing functionality.

> However, it seems unlikely to me that someone will switch to a xenstore-stubdom 
> on a whim because there are also scalability concerns (one ring to rule all 
> connections). So I think it would be fair to say that your application may need 
> to be tweak if you are not using the same feature as the system.

No, this should never be the case IMO. See above.

> 
>>
>>> At which point, it would be easy to say "I don't want this feature" when 
>>> using a socket.
>>
>> I don't see the value of that. If you don't want a feature, just don't use it.
> 
> This is not that simple. Your assumption is the feature will not change the 
> behavior exposed to the application.

Correct.

> 
> I don't think we have such feature today but I don't see what prevents us to do 
> that.

Compatibility prevents us doing that.

If we want different behavior, we need to use different or extended commands
(e.g. like the additional "depth" parameter of SET_WATCH).


Juergen
Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features
Posted by Julien Grall 3 years, 6 months ago
Hi Juergen,

On 12/08/2022 11:56, Juergen Gross wrote:
> On 12.08.22 11:13, Julien Grall wrote:
>> Hi Juergen,
>>
>> On 08/08/2022 12:31, Juergen Gross wrote:
>>> On 08.08.22 13:00, Julien Grall wrote:
>>>>> This would break the use of xenstore-stubdom for such a setup.
>>>>
>>>> I am not sure why it would break the use of xenstore-stubdom. An 
>>>> application will already need to cope with the case Xenstored 
>>>> doesn't support a feature.
>>>
>>> Someone relying to be able to switch off a feature on a socket 
>>> connection
>>> might get into trouble trying to do the same when running with 
>>> xenstore-stubdom.
>>
>> This is not very different from an application that was built against 
>> an old Xenstored and would not be capable to talk properly with the 
>> new Xenstored if the feature is enabled. I understand that...
>>
>>> Switching a feature off will either not work, or switch the feature 
>>> off for all
>>> dom0 connections (which is a single one, of course).
>>
>> ... when using xenstore-stubdom xenstored it means that the feature 
>> will have to be disable for all dom0 connections.
> 
> Wait, I don't think we can ever add features which will change the 
> behavior of
> Xenstore when those new features aren't being used actively.
That would be fine if you know that your client can support it. Reading 
the rest of the e-mail, AFAIU your aim is to use SET_FEATURES to 
indicate which features is supported by Xenstored. This may or may not 
be supported by the client.

When I replied, I had a different idea in mind for SET_FEATURES. But I 
think it wouldn't work in a generic setup because an handshake would be 
necessary (the client would need to advertise the features it supports).

Anyway, now I understand your point and agree with you the we want to 
only have the field 'features' for ring connection.

Thanks for the clarification!

Cheers,

-- 
Julien Grall