[PATCH 0/4] Multiple interface support on top of Multi-FD

Het Gala posted 4 patches 1 year, 11 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20220609073305.142515-1-het.gala@nutanix.com
Maintainers: "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Markus Armbruster <armbru@redhat.com>, Michael Roth <michael.roth@amd.com>, Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eric Blake <eblake@redhat.com>, Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru>, Fam Zheng <fam@euphon.net>, Gerd Hoffmann <kraxel@redhat.com>
[PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Het Gala 1 year, 11 months ago
As of now, the multi-FD feature supports connection over the default network
only. This patch series is a QEMU-side implementation of multiple-interface
support for multi-FD. It enables us to fully utilize dedicated or multiple
NICs in cases where bonding of NICs is not possible.


Introduction
-------------
The multi-FD QEMU implementation currently supports connections only on the default
network. This denies us advantages such as:
- Separating VM live migration traffic from the default network.
- Fully utilizing all NICs’ capacity in cases where creating a LACP (Link
  Aggregation Control Protocol) bond is not supported.

Multi-interface with Multi-FD
-----------------------------
Multiple-interface support on top of basic multi-FD has been implemented in these
patches. Advantages of this implementation are:
- Live migration traffic can be separated from the default network interface by
  creating multiFD channels on the IP addresses of multiple non-default interfaces.
- The number of multi-FD channels on a particular interface can be tuned according
  to the network bandwidth limit of that interface.

Implementation
--------------

The existing 'migrate' QMP command:
{ "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } }

The modified 'migrate' QMP command:
{ "execute": "migrate",
             "arguments": { "uri": "tcp:0:4446", "multi-fd-uri-list": [ {
             "source-uri": "tcp::6900", "destination-uri": "tcp:0:4480",
             "multifd-channels": 4}, { "source-uri": "tcp:10.0.0.0: ",
             "destination-uri": "tcp:11.0.0.0:7789",
             "multifd-channels": 5} ] } }
------------------------------------------------------------------------------

The existing 'migrate-incoming' QMP command:
{ "execute": "migrate-incoming", "arguments": { "uri": "tcp::4446" } }

The modified 'migrate-incoming' QMP command:
{ "execute": "migrate-incoming",
            "arguments": {"uri": "tcp::6789",
            "multi-fd-uri-list" : [ {"destination-uri" : "tcp::6900",
            "multifd-channels": 4}, {"destination-uri" : "tcp:11.0.0.0:7789",
            "multifd-channels": 5} ] } }
------------------------------------------------------------------------------

A new flag (-multi-fd-incoming) has been introduced for specifying the
'migrate-incoming' IP addresses when spawning the destination QEMU process:
-multi-fd-incoming "tcp::6900:4,tcp:11.0.0.0:7789:5"
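
For reference, each comma-separated entry in the proposed -multi-fd-incoming value
is a destination URI followed by its multiFD channel count. A rough Python sketch
of that decomposition (illustrative only, not the parser the series adds to QEMU):

def parse_multi_fd_incoming(value):
    entries = []
    for item in value.split(","):
        uri, channels = item.rsplit(":", 1)   # last ':'-separated field is the channel count
        entries.append((uri, int(channels)))
    return entries

print(parse_multi_fd_incoming("tcp::6900:4,tcp:11.0.0.0:7789:5"))
# -> [('tcp::6900', 4), ('tcp:11.0.0.0:7789', 5)]
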

---
Het Gala (4):
  Modifying ‘migrate’ qmp command to add multi-FD socket on particular
    source and destination pair
  Adding multi-interface support for multi-FD on destination side
  Establishing connection between any non-default source and destination
    pair
  Adding support for multi-FD connections dynamically

 chardev/char-socket.c               |   4 +-
 include/io/channel-socket.h         |  26 ++--
 include/qapi/util.h                 |  10 ++
 include/qemu/sockets.h              |   6 +-
 io/channel-socket.c                 |  50 +++++--
 migration/migration.c               | 211 ++++++++++++++++++++++------
 migration/migration.h               |   3 +-
 migration/multifd.c                 |  42 +++---
 migration/socket.c                  | 119 ++++++++++++----
 migration/socket.h                  |  24 +++-
 monitor/hmp-cmds.c                  |  68 ++++-----
 nbd/client-connection.c             |   2 +-
 qapi/migration.json                 |  92 +++++++++---
 qapi/qapi-util.c                    |  27 ++++
 qemu-nbd.c                          |   4 +-
 qemu-options.hx                     |  18 +++
 scsi/pr-manager-helper.c            |   1 +
 softmmu/vl.c                        |  30 +++-
 tests/unit/test-char.c              |   8 +-
 tests/unit/test-io-channel-socket.c |   4 +-
 tests/unit/test-util-sockets.c      |  16 +--
 ui/input-barrier.c                  |   2 +-
 ui/vnc.c                            |   3 +-
 util/qemu-sockets.c                 |  71 +++++++---
 24 files changed, 626 insertions(+), 215 deletions(-)

-- 
2.22.3


Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Daniel P. Berrangé 1 year, 11 months ago
On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> 
> As of now, the multi-FD feature supports connection over the default network
> only. This Patchset series is a Qemu side implementation of providing multiple
> interfaces support for multi-FD. This enables us to fully utilize dedicated or
> multiple NICs in case bonding of NICs is not possible.
> 
> 
> Introduction
> -------------
> Multi-FD Qemu implementation currently supports connection only on the default
> network. This forbids us from advantages like:
> - Separating VM live migration traffic from the default network.

Perhaps I'm misunderstanding your intent here, but AFAIK it
has been possible to separate VM migration traffic from general
host network traffic essentially forever.

If you have two NICs with IP addresses on different subnets,
then the kernel will pick which NIC to use automatically
based on the IP address of the target matching the kernel
routing table entries.

Management apps have long used this ability in order to
control which NIC migration traffic flows over.

> - Fully utilize all NICs’ capacity in cases where creating a LACP bond (Link
>   Aggregation Control Protocol) is not supported.

Can you elaborate on scenarios in which it is impossible to use LACP
bonding at the kernel level?

> Multi-interface with Multi-FD
> -----------------------------
> Multiple-interface support over basic multi-FD has been implemented in the
> patches. Advantages of this implementation are:
> - Able to separate live migration traffic from default network interface by
>   creating multiFD channels on ip addresses of multiple non-default interfaces.
> - Can optimize the number of multi-FD channels on a particular interface
>   depending upon the network bandwidth limit on a particular interface.

Manually assigning individual channels to different NICs is a pretty
inefficient way to optimize traffic. It feels like you could easily get
into a situation where one NIC ends up idle while the other is busy,
especially if the traffic patterns are different. For example, with
post-copy there's an extra channel for OOB async page requests, and
it's far from clear that manually picking NICs per channel upfront is
going to work for that.  The kernel can continually and dynamically balance
load on the fly and so do much better than any static mapping QEMU
tries to apply, especially if there are multiple distinct QEMUs
competing for bandwidth.


> Implementation
> --------------
> 
> Earlier the 'migrate' qmp command:
> { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } }
> 
> Modified qmp command:
> { "execute": "migrate",
>              "arguments": { "uri": "tcp:0:4446", "multi-fd-uri-list": [ {
>              "source-uri": "tcp::6900", "destination-uri": "tcp:0:4480",
>              "multifd-channels": 4}, { "source-uri": "tcp:10.0.0.0: ",
>              "destination-uri": "tcp:11.0.0.0:7789",
>              "multifd-channels": 5} ] } }

> ------------------------------------------------------------------------------
> 
> Earlier the 'migrate-incoming' qmp command:
> { "execute": "migrate-incoming", "arguments": { "uri": "tcp::4446" } }
> 
> Modified 'migrate-incoming' qmp command:
> { "execute": "migrate-incoming",
>             "arguments": {"uri": "tcp::6789",
>             "multi-fd-uri-list" : [ {"destination-uri" : "tcp::6900",
>             "multifd-channels": 4}, {"destination-uri" : "tcp:11.0.0.0:7789",
>             "multifd-channels": 5} ] } }
> ------------------------------------------------------------------------------

These examples pretty nicely illustrate my concern with this
proposal. It is making QEMU configuration of migration
massively more complicated, while duplicating functionality
the kernel can provide via NIC teaming, but without having
the ability to balance it on the fly as the kernel would.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by manish.mishra 1 year, 11 months ago
On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
>> As of now, the multi-FD feature supports connection over the default network
>> only. This Patchset series is a Qemu side implementation of providing multiple
>> interfaces support for multi-FD. This enables us to fully utilize dedicated or
>> multiple NICs in case bonding of NICs is not possible.
>>
>>
>> Introduction
>> -------------
>> Multi-FD Qemu implementation currently supports connection only on the default
>> network. This forbids us from advantages like:
>> - Separating VM live migration traffic from the default network.

Hi Daniel,

I totally understand your concern about this approach increasing complexity inside
QEMU, when similar things can be done with NIC teaming. But we think this approach
provides much more flexibility to the user in a few cases:

1. We checked our customer data: almost all of the hosts had multiple NICs, but
   LACP support in their setups was very rare. So for those cases this approach
   can help utilize multiple NICs where teaming is not possible.

2. We have recently seen requests to separate out storage, VM network and migration
   traffic over different vswitches, which can be backed by one or more NICs, as
   this gives better predictability and assurance. So hosts with multiple
   IPs/vswitches can be a very common environment. In this kind of environment this
   approach gives per-VM or per-migration flexibility: for a critical VM we can
   still use bandwidth from all available vswitches/interfaces, while normal VMs
   can keep live migration on dedicated NICs only, without changing the complete
   host network topology.

   Eventually we want this to be something like [<ip-pair>, <multiFD-channels>,
   <bandwidth_control>] to provide bandwidth control per interface.

3. The dedicated NIC we mentioned as a use case; agreed with you that it can be
   done without this approach too.

> Perhaps I'm mis-understanding your intent here, but AFAIK it
> has been possible to separate VM migration traffic from general
> host network traffic essentially forever.
>
> If you have two NICs with IP addresses on different subnets,
> then the kernel will pick which NIC to use automatically
> based on the IP address of the target matching the kernel
> routing table entries.
>
> Management apps have long used this ability in order to
> control which NIC migration traffic flows over.
>
>> - Fully utilize all NICs’ capacity in cases where creating a LACP bond (Link
>>    Aggregation Control Protocol) is not supported.
> Can you elaborate on scenarios in which it is impossible to use LACP
> bonding at the kernel level ?
Yes, as mentioned above LACP support was rare in customer setups.
>> Multi-interface with Multi-FD
>> -----------------------------
>> Multiple-interface support over basic multi-FD has been implemented in the
>> patches. Advantages of this implementation are:
>> - Able to separate live migration traffic from default network interface by
>>    creating multiFD channels on ip addresses of multiple non-default interfaces.
>> - Can optimize the number of multi-FD channels on a particular interface
>>    depending upon the network bandwidth limit on a particular interface.
> Manually assigning individual channels to different NICs is a pretty
> inefficient way to optimizing traffic. Feels like you could easily get
> into a situation where one NIC ends up idle while the other is busy,
> especially if the traffic patterns are different. For example with
> post-copy there's an extra channel for OOB async page requests, and
> its far from clear that manually picking NICs per chanel upfront is
> going work for that.  The kernel can continually dynamically balance
> load on the fly and so do much better than any static mapping QEMU
> tries to apply, especially if there are multiple distinct QEMU's
> competing for bandwidth.
>
Yes Daniel, the current solution is only for pre-copy. MultiFD is not yet
supported with postcopy, but in future we can extend it for postcopy
channels too.

>> Implementation
>> --------------
>>
>> Earlier the 'migrate' qmp command:
>> { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } }
>>
>> Modified qmp command:
>> { "execute": "migrate",
>>               "arguments": { "uri": "tcp:0:4446", "multi-fd-uri-list": [ {
>>               "source-uri": "tcp::6900", "destination-uri": "tcp:0:4480",
>>               "multifd-channels": 4}, { "source-uri": "tcp:10.0.0.0: ",
>>               "destination-uri": "tcp:11.0.0.0:7789",
>>               "multifd-channels": 5} ] } }
>> ------------------------------------------------------------------------------
>>
>> Earlier the 'migrate-incoming' qmp command:
>> { "execute": "migrate-incoming", "arguments": { "uri": "tcp::4446" } }
>>
>> Modified 'migrate-incoming' qmp command:
>> { "execute": "migrate-incoming",
>>              "arguments": {"uri": "tcp::6789",
>>              "multi-fd-uri-list" : [ {"destination-uri" : "tcp::6900",
>>              "multifd-channels": 4}, {"destination-uri" : "tcp:11.0.0.0:7789",
>>              "multifd-channels": 5} ] } }
>> ------------------------------------------------------------------------------
> These examples pretty nicely illustrate my concern with this
> proposal. It is making QEMU configuration of migration
> massively more complicated, while duplicating functionality
> the kernel can provide via NIC teaming, but without having
> ability to balance it on the fly as the kernel would.

Yes, agreed Daniel, this raises complexity, but we will make sure that it does not
change/impact anything existing, and we will provide the new options as optional.
A few of these things may not be taken care of currently, as this was posted to
get some early feedback, but we will definitely address that.

>
> With regards,
> Daniel

Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Daniel P. Berrangé 1 year, 11 months ago
On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
> 
> On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> > On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> > > As of now, the multi-FD feature supports connection over the default network
> > > only. This Patchset series is a Qemu side implementation of providing multiple
> > > interfaces support for multi-FD. This enables us to fully utilize dedicated or
> > > multiple NICs in case bonding of NICs is not possible.
> > > 
> > > 
> > > Introduction
> > > -------------
> > > Multi-FD Qemu implementation currently supports connection only on the default
> > > network. This forbids us from advantages like:
> > > - Separating VM live migration traffic from the default network.
> 
> Hi Daniel,
> 
> I totally understand your concern around this approach increasing compexity inside qemu,
> 
> when similar things can be done with NIC teaming. But we thought this approach provides
> 
> much more flexibility to user in few cases like.
> 
> 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
> 
>     in their setups was very rare. So for those cases this approach can help in utilise multiple
> 
>     NICs as teaming is not possible there.

AFAIK,  LACP is not required in order to do link aggregation with Linux.
Traditional Linux bonding has no special NIC hardware or switch requirements,
so LACP is merely a "nice to have" in order to simplify some aspects.

IOW, migration with traffic spread across multiple NICs is already
possible AFAICT.

I can understand that some people may not have actually configured
bonding on their hosts, but it is not unreasonable to request that
they do so, if they want to take advantage of aggregated bandwidth.

It has the further benefit that it will be fault tolerant. With
this proposal if any single NIC has a problem, the whole migration
will get stuck. With kernel-level bonding, if any single NIC has
a problem, it'll get offlined by the kernel and migration will
continue to work across the remaining active NICs.

> 2. We have seen requests recently to separate out traffic of storage, VM netwrok, migration
> 
>     over different vswitch which can be backed by 1 or more NICs as this give better
> 
>     predictability and assurance. So host with multiple ips/vswitches can be very common
> 
>     environment. In this kind of enviroment this approach gives per vm or migration level
> 
>     flexibilty, like for critical VM we can still use bandwidth from all available vswitch/interface
> 
>     but for normal VM they can keep live migration only on dedicated NICs without changing
> 
>     complete host network topology.
> 
>     At final we want it to be something like this [<ip-pair>, <multiFD-channels>, <bandwidth_control>]
> 
>     to provide bandwidth_control per interface.

Again, it is already possible to separate migration traffic from storage
traffic, from other network traffic. The target IP given will influence
which NIC is used based on routing table and I know this is already
done widely with OpenStack deployments.

> 3. Dedicated NIC we mentioned as a use case, agree with you it can be done without this
> 
>     approach too.


> > > Multi-interface with Multi-FD
> > > -----------------------------
> > > Multiple-interface support over basic multi-FD has been implemented in the
> > > patches. Advantages of this implementation are:
> > > - Able to separate live migration traffic from default network interface by
> > >    creating multiFD channels on ip addresses of multiple non-default interfaces.
> > > - Can optimize the number of multi-FD channels on a particular interface
> > >    depending upon the network bandwidth limit on a particular interface.
> > Manually assigning individual channels to different NICs is a pretty
> > inefficient way to optimizing traffic. Feels like you could easily get
> > into a situation where one NIC ends up idle while the other is busy,
> > especially if the traffic patterns are different. For example with
> > post-copy there's an extra channel for OOB async page requests, and
> > its far from clear that manually picking NICs per chanel upfront is
> > going work for that.  The kernel can continually dynamically balance
> > load on the fly and so do much better than any static mapping QEMU
> > tries to apply, especially if there are multiple distinct QEMU's
> > competing for bandwidth.
> > 
> Yes, Daniel current solution is only for pre-copy. As with postcopy
> multiFD is not yet supported but in future we can extend it for postcopy
> 
> channels too.
> 
> > > Implementation
> > > --------------
> > > 
> > > Earlier the 'migrate' qmp command:
> > > { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } }
> > > 
> > > Modified qmp command:
> > > { "execute": "migrate",
> > >               "arguments": { "uri": "tcp:0:4446", "multi-fd-uri-list": [ {
> > >               "source-uri": "tcp::6900", "destination-uri": "tcp:0:4480",
> > >               "multifd-channels": 4}, { "source-uri": "tcp:10.0.0.0: ",
> > >               "destination-uri": "tcp:11.0.0.0:7789",
> > >               "multifd-channels": 5} ] } }
> > > ------------------------------------------------------------------------------
> > > 
> > > Earlier the 'migrate-incoming' qmp command:
> > > { "execute": "migrate-incoming", "arguments": { "uri": "tcp::4446" } }
> > > 
> > > Modified 'migrate-incoming' qmp command:
> > > { "execute": "migrate-incoming",
> > >              "arguments": {"uri": "tcp::6789",
> > >              "multi-fd-uri-list" : [ {"destination-uri" : "tcp::6900",
> > >              "multifd-channels": 4}, {"destination-uri" : "tcp:11.0.0.0:7789",
> > >              "multifd-channels": 5} ] } }
> > > ------------------------------------------------------------------------------
> > These examples pretty nicely illustrate my concern with this
> > proposal. It is making QEMU configuration of migration
> > massively more complicated, while duplicating functionality
> > the kernel can provide via NIC teaming, but without having
> > ability to balance it on the fly as the kernel would.
> 
> Yes, agree Daniel this raises complexity but we will make sure that it does not
> 
> change/imapct anything existing and we provide new options as optional.

The added code is certainly going to impact ongoing maintenance of the QEMU I/O
layer and of migration in particular. I'm not convinced this complexity
is compelling enough, compared to leveraging kernel-native bonding,
to justify the maintenance burden it will impose.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Daniel P. Berrangé 1 year, 11 months ago
On Wed, Jun 15, 2022 at 05:43:28PM +0100, Daniel P. Berrangé wrote:
> On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
> > 
> > On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> > > On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> > > > As of now, the multi-FD feature supports connection over the default network
> > > > only. This Patchset series is a Qemu side implementation of providing multiple
> > > > interfaces support for multi-FD. This enables us to fully utilize dedicated or
> > > > multiple NICs in case bonding of NICs is not possible.
> > > > 
> > > > 
> > > > Introduction
> > > > -------------
> > > > Multi-FD Qemu implementation currently supports connection only on the default
> > > > network. This forbids us from advantages like:
> > > > - Separating VM live migration traffic from the default network.
> > 
> > Hi Daniel,
> > 
> > I totally understand your concern around this approach increasing compexity inside qemu,
> > 
> > when similar things can be done with NIC teaming. But we thought this approach provides
> > 
> > much more flexibility to user in few cases like.
> > 
> > 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
> > 
> >     in their setups was very rare. So for those cases this approach can help in utilise multiple
> > 
> >     NICs as teaming is not possible there.
> 
> AFAIK,  LACP is not required in order to do link aggregation with Linux.
> Traditional Linux bonding has no special NIC hardware or switch requirements,
> so LACP is merely a "nice to have" in order to simplify some aspects.
> 
> IOW, migration with traffic spread across multiple NICs is already
> possible AFAICT.
> 
> I can understand that some people may not have actually configured
> bonding on their hosts, but it is not unreasonable to request that
> they do so, if they want to take advantage fo aggrated bandwidth.
> 
> It has the further benefit that it will be fault tolerant. With
> this proposal if any single NIC has a problem, the whole migration
> will get stuck. With kernel level bonding, if any single NIC haus
> a problem, it'll get offlined by the kernel and migration will
> continue to  work across remaining active NICs.
> 
> > 2. We have seen requests recently to separate out traffic of storage, VM netwrok, migration
> > 
> >     over different vswitch which can be backed by 1 or more NICs as this give better
> > 
> >     predictability and assurance. So host with multiple ips/vswitches can be very common
> > 
> >     environment. In this kind of enviroment this approach gives per vm or migration level
> > 
> >     flexibilty, like for critical VM we can still use bandwidth from all available vswitch/interface
> > 
> >     but for normal VM they can keep live migration only on dedicated NICs without changing
> > 
> >     complete host network topology.
> > 
> >     At final we want it to be something like this [<ip-pair>, <multiFD-channels>, <bandwidth_control>]
> > 
> >     to provide bandwidth_control per interface.
> 
> Again, it is already possible to separate migration traffic from storage
> traffic, from other network traffic. The target IP given will influence
> which NIC is used based on routing table and I know this is already
> done widely with OpenStack deployments.

Actually I should clarify this is only practical if the two NICs are
using different IP subnets, otherwise routing rules are not viable.
So setting the source IP would be needed to select between a pair
of NICs on the same IP subnet.
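
For reference, setting the source IP at the socket level is just a
bind-before-connect, which is presumably what the proposed 'source-uri'
maps to; a minimal Python sketch with placeholder addresses:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("10.0.0.1", 0))        # source IP of the chosen NIC, ephemeral port
s.connect(("10.0.0.2", 7789))  # destination migration endpoint
s.close()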

Previous usage I've seen has always setup fully distinct IP subnets
for generic vs storage vs migration network traffic.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Dr. David Alan Gilbert 1 year, 11 months ago
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Wed, Jun 15, 2022 at 05:43:28PM +0100, Daniel P. Berrangé wrote:
> > On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
> > > 
> > > On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> > > > On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> > > > > As of now, the multi-FD feature supports connection over the default network
> > > > > only. This Patchset series is a Qemu side implementation of providing multiple
> > > > > interfaces support for multi-FD. This enables us to fully utilize dedicated or
> > > > > multiple NICs in case bonding of NICs is not possible.
> > > > > 
> > > > > 
> > > > > Introduction
> > > > > -------------
> > > > > Multi-FD Qemu implementation currently supports connection only on the default
> > > > > network. This forbids us from advantages like:
> > > > > - Separating VM live migration traffic from the default network.
> > > 
> > > Hi Daniel,
> > > 
> > > I totally understand your concern around this approach increasing compexity inside qemu,
> > > 
> > > when similar things can be done with NIC teaming. But we thought this approach provides
> > > 
> > > much more flexibility to user in few cases like.
> > > 
> > > 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
> > > 
> > >     in their setups was very rare. So for those cases this approach can help in utilise multiple
> > > 
> > >     NICs as teaming is not possible there.
> > 
> > AFAIK,  LACP is not required in order to do link aggregation with Linux.
> > Traditional Linux bonding has no special NIC hardware or switch requirements,
> > so LACP is merely a "nice to have" in order to simplify some aspects.
> > 
> > IOW, migration with traffic spread across multiple NICs is already
> > possible AFAICT.
> > 
> > I can understand that some people may not have actually configured
> > bonding on their hosts, but it is not unreasonable to request that
> > they do so, if they want to take advantage fo aggrated bandwidth.
> > 
> > It has the further benefit that it will be fault tolerant. With
> > this proposal if any single NIC has a problem, the whole migration
> > will get stuck. With kernel level bonding, if any single NIC haus
> > a problem, it'll get offlined by the kernel and migration will
> > continue to  work across remaining active NICs.
> > 
> > > 2. We have seen requests recently to separate out traffic of storage, VM netwrok, migration
> > > 
> > >     over different vswitch which can be backed by 1 or more NICs as this give better
> > > 
> > >     predictability and assurance. So host with multiple ips/vswitches can be very common
> > > 
> > >     environment. In this kind of enviroment this approach gives per vm or migration level
> > > 
> > >     flexibilty, like for critical VM we can still use bandwidth from all available vswitch/interface
> > > 
> > >     but for normal VM they can keep live migration only on dedicated NICs without changing
> > > 
> > >     complete host network topology.
> > > 
> > >     At final we want it to be something like this [<ip-pair>, <multiFD-channels>, <bandwidth_control>]
> > > 
> > >     to provide bandwidth_control per interface.
> > 
> > Again, it is already possible to separate migration traffic from storage
> > traffic, from other network traffic. The target IP given will influence
> > which NIC is used based on routing table and I know this is already
> > done widely with OpenStack deployments.
> 
> Actually I should clarify this is only practical if the two NICs are
> using different IP subnets, otherwise routing rules are not viable.
> So needing to set source IP would be needed to select between a pair
> of NICs on the same IP subnet.

Yeh so I think that's one reason that the idea in this series is OK
(together with the idea for the NUMA stuff) and I suspect there are
other cases as well.

Dave

> Previous usage I've seen has always setup fully distinct IP subnets
> for generic vs storage vs migration network traffic.
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by manish.mishra 1 year, 11 months ago
On 16/06/22 9:20 pm, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>> On Wed, Jun 15, 2022 at 05:43:28PM +0100, Daniel P. Berrangé wrote:
>>> On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
>>>> On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
>>>>> On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
>>>>>> As of now, the multi-FD feature supports connection over the default network
>>>>>> only. This Patchset series is a Qemu side implementation of providing multiple
>>>>>> interfaces support for multi-FD. This enables us to fully utilize dedicated or
>>>>>> multiple NICs in case bonding of NICs is not possible.
>>>>>>
>>>>>>
>>>>>> Introduction
>>>>>> -------------
>>>>>> Multi-FD Qemu implementation currently supports connection only on the default
>>>>>> network. This forbids us from advantages like:
>>>>>> - Separating VM live migration traffic from the default network.
>>>> Hi Daniel,
>>>>
>>>> I totally understand your concern around this approach increasing compexity inside qemu,
>>>>
>>>> when similar things can be done with NIC teaming. But we thought this approach provides
>>>>
>>>> much more flexibility to user in few cases like.
>>>>
>>>> 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
>>>>
>>>>      in their setups was very rare. So for those cases this approach can help in utilise multiple
>>>>
>>>>      NICs as teaming is not possible there.
>>> AFAIK,  LACP is not required in order to do link aggregation with Linux.
>>> Traditional Linux bonding has no special NIC hardware or switch requirements,
>>> so LACP is merely a "nice to have" in order to simplify some aspects.
>>>
>>> IOW, migration with traffic spread across multiple NICs is already
>>> possible AFAICT.
>>>
>>> I can understand that some people may not have actually configured
>>> bonding on their hosts, but it is not unreasonable to request that
>>> they do so, if they want to take advantage fo aggrated bandwidth.
>>>
>>> It has the further benefit that it will be fault tolerant. With
>>> this proposal if any single NIC has a problem, the whole migration
>>> will get stuck. With kernel level bonding, if any single NIC haus
>>> a problem, it'll get offlined by the kernel and migration will
>>> continue to  work across remaining active NICs.
>>>
>>>> 2. We have seen requests recently to separate out traffic of storage, VM netwrok, migration
>>>>
>>>>      over different vswitch which can be backed by 1 or more NICs as this give better
>>>>
>>>>      predictability and assurance. So host with multiple ips/vswitches can be very common
>>>>
>>>>      environment. In this kind of enviroment this approach gives per vm or migration level
>>>>
>>>>      flexibilty, like for critical VM we can still use bandwidth from all available vswitch/interface
>>>>
>>>>      but for normal VM they can keep live migration only on dedicated NICs without changing
>>>>
>>>>      complete host network topology.
>>>>
>>>>      At final we want it to be something like this [<ip-pair>, <multiFD-channels>, <bandwidth_control>]
>>>>
>>>>      to provide bandwidth_control per interface.
>>> Again, it is already possible to separate migration traffic from storage
>>> traffic, from other network traffic. The target IP given will influence
>>> which NIC is used based on routing table and I know this is already
>>> done widely with OpenStack deployments.
>> Actually I should clarify this is only practical if the two NICs are
>> using different IP subnets, otherwise routing rules are not viable.
>> So needing to set source IP would be needed to select between a pair
>> of NICs on the same IP subnet.
> Yeh so I think that's one reason that the idea in this series is OK
> (together with the idea for the NUMA stuff) and I suspect there are
> other cases as well.
>
> Dave
>
Yes David, multiFD per NUMA node seems an interesting idea. I was just curious
how much throughput difference we would see per multiFD channel
with a local vs. a remote NIC?

thanks

Manish Mishra

>> Previous usage I've seen has always setup fully distinct IP subnets
>> for generic vs storage vs migration network traffic.
>>
>> With regards,
>> Daniel
>> -- 
>> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>>

Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Dr. David Alan Gilbert 1 year, 11 months ago
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
> > 
> > On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> > > On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> > > > As of now, the multi-FD feature supports connection over the default network
> > > > only. This Patchset series is a Qemu side implementation of providing multiple
> > > > interfaces support for multi-FD. This enables us to fully utilize dedicated or
> > > > multiple NICs in case bonding of NICs is not possible.
> > > > 
> > > > 
> > > > Introduction
> > > > -------------
> > > > Multi-FD Qemu implementation currently supports connection only on the default
> > > > network. This forbids us from advantages like:
> > > > - Separating VM live migration traffic from the default network.
> > 
> > Hi Daniel,
> > 
> > I totally understand your concern around this approach increasing compexity inside qemu,
> > 
> > when similar things can be done with NIC teaming. But we thought this approach provides
> > 
> > much more flexibility to user in few cases like.
> > 
> > 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
> > 
> >     in their setups was very rare. So for those cases this approach can help in utilise multiple
> > 
> >     NICs as teaming is not possible there.
> 
> AFAIK,  LACP is not required in order to do link aggregation with Linux.
> Traditional Linux bonding has no special NIC hardware or switch requirements,
> so LACP is merely a "nice to have" in order to simplify some aspects.
> 
> IOW, migration with traffic spread across multiple NICs is already
> possible AFAICT.

Are we sure that works with multifd?  I've seen a lot of bonding NIC
setups which spread based on a hash of source/destination IP and port
numbers; given that we use the same dest port and IP at the moment what
happens in reality?  That hashing can be quite delicate for high
bandwidth single streams.

> I can understand that some people may not have actually configured
> bonding on their hosts, but it is not unreasonable to request that
> they do so, if they want to take advantage fo aggrated bandwidth.
> 
> It has the further benefit that it will be fault tolerant. With
> this proposal if any single NIC has a problem, the whole migration
> will get stuck. With kernel level bonding, if any single NIC haus
> a problem, it'll get offlined by the kernel and migration will
> continue to  work across remaining active NICs.
> 
> > 2. We have seen requests recently to separate out traffic of storage, VM netwrok, migration
> > 
> >     over different vswitch which can be backed by 1 or more NICs as this give better
> > 
> >     predictability and assurance. So host with multiple ips/vswitches can be very common
> > 
> >     environment. In this kind of enviroment this approach gives per vm or migration level
> > 
> >     flexibilty, like for critical VM we can still use bandwidth from all available vswitch/interface
> > 
> >     but for normal VM they can keep live migration only on dedicated NICs without changing
> > 
> >     complete host network topology.
> > 
> >     At final we want it to be something like this [<ip-pair>, <multiFD-channels>, <bandwidth_control>]
> > 
> >     to provide bandwidth_control per interface.
> 
> Again, it is already possible to separate migration traffic from storage
> traffic, from other network traffic. The target IP given will influence
> which NIC is used based on routing table and I know this is already
> done widely with OpenStack deployments.
> 
> > 3. Dedicated NIC we mentioned as a use case, agree with you it can be done without this
> > 
> >     approach too.
> 
> 
> > > > Multi-interface with Multi-FD
> > > > -----------------------------
> > > > Multiple-interface support over basic multi-FD has been implemented in the
> > > > patches. Advantages of this implementation are:
> > > > - Able to separate live migration traffic from default network interface by
> > > >    creating multiFD channels on ip addresses of multiple non-default interfaces.
> > > > - Can optimize the number of multi-FD channels on a particular interface
> > > >    depending upon the network bandwidth limit on a particular interface.
> > > Manually assigning individual channels to different NICs is a pretty
> > > inefficient way to optimizing traffic. Feels like you could easily get
> > > into a situation where one NIC ends up idle while the other is busy,
> > > especially if the traffic patterns are different. For example with
> > > post-copy there's an extra channel for OOB async page requests, and
> > > its far from clear that manually picking NICs per chanel upfront is
> > > going work for that.  The kernel can continually dynamically balance
> > > load on the fly and so do much better than any static mapping QEMU
> > > tries to apply, especially if there are multiple distinct QEMU's
> > > competing for bandwidth.
> > > 
> > Yes, Daniel current solution is only for pre-copy. As with postcopy
> > multiFD is not yet supported but in future we can extend it for postcopy

I had been thinking about explicit selection of network device for NUMA
use though; ideally I'd like to be able to associate a set of multifd
threads to each NUMA node, and then associate a NIC with that set of
threads; so that the migration happens down the NIC that's on the node
the RAM is on.  On a really good day you'd have one NIC per top level
NUMA node.

> > channels too.
> > 
> > > > Implementation
> > > > --------------
> > > > 
> > > > Earlier the 'migrate' qmp command:
> > > > { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } }
> > > > 
> > > > Modified qmp command:
> > > > { "execute": "migrate",
> > > >               "arguments": { "uri": "tcp:0:4446", "multi-fd-uri-list": [ {
> > > >               "source-uri": "tcp::6900", "destination-uri": "tcp:0:4480",
> > > >               "multifd-channels": 4}, { "source-uri": "tcp:10.0.0.0: ",
> > > >               "destination-uri": "tcp:11.0.0.0:7789",
> > > >               "multifd-channels": 5} ] } }
> > > > ------------------------------------------------------------------------------
> > > > 
> > > > Earlier the 'migrate-incoming' qmp command:
> > > > { "execute": "migrate-incoming", "arguments": { "uri": "tcp::4446" } }
> > > > 
> > > > Modified 'migrate-incoming' qmp command:
> > > > { "execute": "migrate-incoming",
> > > >              "arguments": {"uri": "tcp::6789",
> > > >              "multi-fd-uri-list" : [ {"destination-uri" : "tcp::6900",
> > > >              "multifd-channels": 4}, {"destination-uri" : "tcp:11.0.0.0:7789",
> > > >              "multifd-channels": 5} ] } }
> > > > ------------------------------------------------------------------------------
> > > These examples pretty nicely illustrate my concern with this
> > > proposal. It is making QEMU configuration of migration
> > > massively more complicated, while duplicating functionality
> > > the kernel can provide via NIC teaming, but without having
> > > ability to balance it on the fly as the kernel would.
> > 
> > Yes, agree Daniel this raises complexity but we will make sure that it does not
> > 
> > change/imapct anything existing and we provide new options as optional.
> 
> The added code is certainly going to impact ongoing maint of QEMU I/O
> layer and migration in particular. I'm not convinced this complexity
> is compelling enough compared to leveraging kernel native bonding
> to justify the maint burden it will impose.

Dave

> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Daniel P. Berrangé 1 year, 11 months ago
On Wed, Jun 15, 2022 at 08:14:26PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
> > > 
> > > On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> > > > On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> > > > > As of now, the multi-FD feature supports connection over the default network
> > > > > only. This Patchset series is a Qemu side implementation of providing multiple
> > > > > interfaces support for multi-FD. This enables us to fully utilize dedicated or
> > > > > multiple NICs in case bonding of NICs is not possible.
> > > > > 
> > > > > 
> > > > > Introduction
> > > > > -------------
> > > > > Multi-FD Qemu implementation currently supports connection only on the default
> > > > > network. This forbids us from advantages like:
> > > > > - Separating VM live migration traffic from the default network.
> > > 
> > > Hi Daniel,
> > > 
> > > I totally understand your concern around this approach increasing compexity inside qemu,
> > > 
> > > when similar things can be done with NIC teaming. But we thought this approach provides
> > > 
> > > much more flexibility to user in few cases like.
> > > 
> > > 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
> > > 
> > >     in their setups was very rare. So for those cases this approach can help in utilise multiple
> > > 
> > >     NICs as teaming is not possible there.
> > 
> > AFAIK,  LACP is not required in order to do link aggregation with Linux.
> > Traditional Linux bonding has no special NIC hardware or switch requirements,
> > so LACP is merely a "nice to have" in order to simplify some aspects.
> > 
> > IOW, migration with traffic spread across multiple NICs is already
> > possible AFAICT.
> 
> Are we sure that works with multifd?  I've seen a lot of bonding NIC
> setups which spread based on a hash of source/destination IP and port
> numbers; given that we use the same dest port and IP at the moment what
> happens in reality?  That hashing can be quite delicate for high
> bandwidth single streams.

The simplest Linux bonding mode does per-packet round-robin across 
NICs, so traffic from the collection of multifd connections should
fill up all the NICs in the bond. There are of course other modes
which may be sub-optimal for the reasons you describe. Which mode
to pick depends on the type of service traffic patterns you're
aiming to balance.

> > > > > Multi-interface with Multi-FD
> > > > > -----------------------------
> > > > > Multiple-interface support over basic multi-FD has been implemented in the
> > > > > patches. Advantages of this implementation are:
> > > > > - Able to separate live migration traffic from default network interface by
> > > > >    creating multiFD channels on ip addresses of multiple non-default interfaces.
> > > > > - Can optimize the number of multi-FD channels on a particular interface
> > > > >    depending upon the network bandwidth limit on a particular interface.
> > > > Manually assigning individual channels to different NICs is a pretty
> > > > inefficient way to optimizing traffic. Feels like you could easily get
> > > > into a situation where one NIC ends up idle while the other is busy,
> > > > especially if the traffic patterns are different. For example with
> > > > post-copy there's an extra channel for OOB async page requests, and
> > > > its far from clear that manually picking NICs per chanel upfront is
> > > > going work for that.  The kernel can continually dynamically balance
> > > > load on the fly and so do much better than any static mapping QEMU
> > > > tries to apply, especially if there are multiple distinct QEMU's
> > > > competing for bandwidth.
> > > > 
> > > Yes, Daniel current solution is only for pre-copy. As with postcopy
> > > multiFD is not yet supported but in future we can extend it for postcopy
> 
> I had been thinking about explicit selection of network device for NUMA
> use though; ideally I'd like to be able to associate a set of multifd
> threads to each NUMA node, and then associate a NIC with that set of
> threads; so that the migration happens down the NIC that's on the node
> the RAM is on.  On a really good day you'd have one NIC per top level
> NUMA node.

Now that's an interesting idea, and not one that can be dealt with
by bonding, since the network layer won't be aware of the NUMA
affinity constraints.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by manish.mishra 1 year, 11 months ago
On 16/06/22 1:46 pm, Daniel P. Berrangé wrote:
> On Wed, Jun 15, 2022 at 08:14:26PM +0100, Dr. David Alan Gilbert wrote:
>> * Daniel P. Berrangé (berrange@redhat.com) wrote:
>>> On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
>>>> On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
>>>>> On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
>>>>>> As of now, the multi-FD feature supports connection over the default network
>>>>>> only. This Patchset series is a Qemu side implementation of providing multiple
>>>>>> interfaces support for multi-FD. This enables us to fully utilize dedicated or
>>>>>> multiple NICs in case bonding of NICs is not possible.
>>>>>>
>>>>>>
>>>>>> Introduction
>>>>>> -------------
>>>>>> Multi-FD Qemu implementation currently supports connection only on the default
>>>>>> network. This forbids us from advantages like:
>>>>>> - Separating VM live migration traffic from the default network.
>>>> Hi Daniel,
>>>>
>>>> I totally understand your concern around this approach increasing compexity inside qemu,
>>>>
>>>> when similar things can be done with NIC teaming. But we thought this approach provides
>>>>
>>>> much more flexibility to user in few cases like.
>>>>
>>>> 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
>>>>
>>>>      in their setups was very rare. So for those cases this approach can help in utilise multiple
>>>>
>>>>      NICs as teaming is not possible there.
>>> AFAIK,  LACP is not required in order to do link aggregation with Linux.
>>> Traditional Linux bonding has no special NIC hardware or switch requirements,
>>> so LACP is merely a "nice to have" in order to simplify some aspects.
>>>
>>> IOW, migration with traffic spread across multiple NICs is already
>>> possible AFAICT.
>> Are we sure that works with multifd?  I've seen a lot of bonding NIC
>> setups which spread based on a hash of source/destination IP and port
>> numbers; given that we use the same dest port and IP at the moment what
>> happens in reality?  That hashing can be quite delicate for high
>> bandwidth single streams.
> The simplest Linux bonding mode does per-packet round-robin across
> NICs, so traffic from the collection of multifd connections should
> fill up all the NICs in the bond. There are of course other modes
> which may be sub-optimal for the reasons you describe. Which mode
> to pick depends on the type of service traffic patterns you're
> aiming to balance.

My understanding of networking is not good enough, so apologies in advance if
something does not make sense. As I understand it, load balancing is easy on the
sender side, because we have full control over where each packet is sent, but it
is complicated on the receive side if we do not have LACP-like support. I see
there are some teaming techniques that balance incoming traffic, e.g. by answering
ARP requests with different slaves' MAC addresses, but that does not work for our
use case and may require a complicated setup for proper usage. Our use case is
something like this: both source and destination have two 10 Gbps NICs each, and
we want to get a throughput of 20 Gbps for live migration.
thanks

Manish Mishra

>
>>>>>> Multi-interface with Multi-FD
>>>>>> -----------------------------
>>>>>> Multiple-interface support over basic multi-FD has been implemented in the
>>>>>> patches. Advantages of this implementation are:
>>>>>> - Able to separate live migration traffic from default network interface by
>>>>>>     creating multiFD channels on ip addresses of multiple non-default interfaces.
>>>>>> - Can optimize the number of multi-FD channels on a particular interface
>>>>>>     depending upon the network bandwidth limit on a particular interface.
>>>>> Manually assigning individual channels to different NICs is a pretty
>>>>> inefficient way to optimizing traffic. Feels like you could easily get
>>>>> into a situation where one NIC ends up idle while the other is busy,
>>>>> especially if the traffic patterns are different. For example with
>>>>> post-copy there's an extra channel for OOB async page requests, and
>>>>> its far from clear that manually picking NICs per chanel upfront is
>>>>> going work for that.  The kernel can continually dynamically balance
>>>>> load on the fly and so do much better than any static mapping QEMU
>>>>> tries to apply, especially if there are multiple distinct QEMU's
>>>>> competing for bandwidth.
>>>>>
>>>> Yes, Daniel current solution is only for pre-copy. As with postcopy
>>>> multiFD is not yet supported but in future we can extend it for postcopy
>> I had been thinking about explicit selection of network device for NUMA
>> use though; ideally I'd like to be able to associate a set of multifd
>> threads to each NUMA node, and then associate a NIC with that set of
>> threads; so that the migration happens down the NIC that's on the node
>> the RAM is on.  On a really good day you'd have one NIC per top level
>> NUMA node.
> Now that's an interesting idea, and not one that can be dealt with
> by bonding, since the network layer won't be aware of the NUMA
> affinity constraints.
>
>
> With regards,
> Daniel

Re: [PATCH 0/4] Multiple interface support on top of Multi-FD
Posted by Daniel P. Berrangé 1 year, 11 months ago
On Thu, Jun 16, 2022 at 03:44:09PM +0530, manish.mishra wrote:
> 
> On 16/06/22 1:46 pm, Daniel P. Berrangé wrote:
> > On Wed, Jun 15, 2022 at 08:14:26PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > On Fri, Jun 10, 2022 at 05:58:31PM +0530, manish.mishra wrote:
> > > > > On 09/06/22 9:17 pm, Daniel P. Berrangé wrote:
> > > > > > On Thu, Jun 09, 2022 at 07:33:01AM +0000, Het Gala wrote:
> > > > > > > As of now, the multi-FD feature supports connection over the default network
> > > > > > > only. This Patchset series is a Qemu side implementation of providing multiple
> > > > > > > interfaces support for multi-FD. This enables us to fully utilize dedicated or
> > > > > > > multiple NICs in case bonding of NICs is not possible.
> > > > > > > 
> > > > > > > 
> > > > > > > Introduction
> > > > > > > -------------
> > > > > > > Multi-FD Qemu implementation currently supports connection only on the default
> > > > > > > network. This forbids us from advantages like:
> > > > > > > - Separating VM live migration traffic from the default network.
> > > > > Hi Daniel,
> > > > > 
> > > > > I totally understand your concern around this approach increasing compexity inside qemu,
> > > > > 
> > > > > when similar things can be done with NIC teaming. But we thought this approach provides
> > > > > 
> > > > > much more flexibility to user in few cases like.
> > > > > 
> > > > > 1. We checked our customer data, almost all of the host had multiple NIC, but LACP support
> > > > > 
> > > > >      in their setups was very rare. So for those cases this approach can help in utilise multiple
> > > > > 
> > > > >      NICs as teaming is not possible there.
> > > > AFAIK,  LACP is not required in order to do link aggregation with Linux.
> > > > Traditional Linux bonding has no special NIC hardware or switch requirements,
> > > > so LACP is merely a "nice to have" in order to simplify some aspects.
> > > > 
> > > > IOW, migration with traffic spread across multiple NICs is already
> > > > possible AFAICT.
> > > Are we sure that works with multifd?  I've seen a lot of bonding NIC
> > > setups which spread based on a hash of source/destination IP and port
> > > numbers; given that we use the same dest port and IP at the moment what
> > > happens in reality?  That hashing can be quite delicate for high
> > > bandwidth single streams.
> > The simplest Linux bonding mode does per-packet round-robin across
> > NICs, so traffic from the collection of multifd connections should
> > fill up all the NICs in the bond. There are of course other modes
> > which may be sub-optimal for the reasons you describe. Which mode
> > to pick depends on the type of service traffic patterns you're
> > aiming to balance.
> 
> My understanding on networking is not good enough so apologies in advance if something
> does not make sense. As per my understanding it is easy to do load balancing on sender
> side because we have full control where to send packet but complicated on receive side
> if we do not have LACP like support. I see there are some teaming technique which does
> load balancing of incoming traffic by possibly sending different slaves mac address on arp
> requests but that does not work for our use case and may require a complicated setup
> for proper usage. Our use case can be something like this e.g. both source and destination
> has 2-2 NICs of 10Gbps each and we want to get a throughput of 20Gbps for live migration.

I believe you are right. The Linux bonding will give us the full 20 Gbps
throughput on the transmit side, without any hardware dependencies.
On the receive side, however, there is a dependency on the network
switch being able to balance the traffic it forwards to the target.
This is fairly common in switches, but the typical policies based on
hashing the MAC/IP addr will not be sufficient in this case.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|