[libvirt] [v4 0/9] Support cache tune in libvirt

Eli Qiao posted 9 patches 7 years, 2 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/libvirt tags/patchew/1486708933-5731-1-git-send-email-liyong.qiao@intel.com
There is a newer version of this series
[libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Eli Qiao 7 years, 2 months ago
Addressed comments from v3 -> v4:

Daniel & Marcelo:
       * Added concurrence support

Addressed comments from v2 -> v3:

Daniel:
      * Fixed coding style, passed `make check` and `make syntax-check`

      * Renamed variables and moved them from the header file to the C file.

      * For locking/mutex support, no progress.

      There has been some discussion on the mailing list, but I cannot find
      a better way to add locking support without a performance impact.

      I'll explain the process below; please advise on what we should do.

      VM create:
      1) Get the remaining cache on each bank of the host. This is
         shared among all VMs.
      2) Calculate the schemata on the bank based on all created resctrl
         domains' schemata.
      3) Calculate the default schemata by scanning all domains' schemata.
      4) Flush default schemata to /sys/fs/resctrl/schemata

      VM destroy:
      1) Remove the resctrl domain of that VM
      2) Recalculate default schemata
      3) Flush default schemata to /sys/fs/resctrl/schemata
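
      The recomputation in the create/destroy flows above can be sketched
      roughly as follows (the helper name and layout are illustrative
      assumptions; the real logic lives in virresctrl.c and operates on
      CBM bitmasks per cache bank):

```python
# Sketch of the default-schemata recomputation described above.
# Helper name and shape are illustrative, not libvirt's actual API.

def recompute_default(domain_cbms, cbm_len):
    """Return the default CBM for one bank: every capacity bit that is
    not claimed by any resctrl domain, clamped to a nonzero mask.

    domain_cbms: per-domain CBM integers for this bank.
    cbm_len: number of capacity bits the bank supports.
    """
    full_mask = (1 << cbm_len) - 1
    used = 0
    for cbm in domain_cbms:       # scan all created domains' schemata
        used |= cbm
    default = full_mask & ~used   # bits left over for the default group
    if default == 0:              # the kernel rejects an all-zero mask
        default = 1
    return default

# Two VMs hold 0xf0 and 0x0f on a 20-bit bank; the default gets the rest.
print(hex(recompute_default([0xF0, 0x0F], 20)))   # 0xfff00
```

      The result would then be flushed to /sys/fs/resctrl/schemata as the
      last step of either flow.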

      The key point is that all VMs share /sys/fs/resctrl/schemata, and
      when a VM creates a resctrl domain, the schemata of that VM depends on
      the default schemata and all other existing schematas. So a global
      mutex is required.

      Before calculating a schemata or updating the default schemata,
      libvirt must acquire this global mutex.

      I will try to think more about how to support this gracefully in the
      next patch set.
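
      For illustration only (this is not part of the series): the global
      mutex could be an advisory flock on an agreed-upon file under
      /sys/fs/resctrl, which would also serialize against non-libvirt
      users of the filesystem. The lock path below is an assumption.

```python
# Illustrative advisory-lock wrapper for the global mutex discussed
# above; the lock path and usage are assumptions, not libvirt code.
import fcntl
from contextlib import contextmanager

@contextmanager
def resctrl_lock(path="/sys/fs/resctrl/schemata"):
    # Any agreed-upon file works; flock() is advisory, so every
    # process that writes under /sys/fs/resctrl must use the same one.
    f = open(path)
    try:
        fcntl.flock(f, fcntl.LOCK_EX)   # block until the mutex is ours
        yield
    finally:
        fcntl.flock(f, fcntl.LOCK_UN)
        f.close()

# with resctrl_lock():
#     ...recalculate and write /sys/fs/resctrl/schemata...
```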

Marcelo:
      * Added vcpu support for cachetune; this allows the user to define
        which vCPUs use which cache allocation bank.

        <cachetune id='0' host_id='0' size='3072' unit='KiB' vcpus='0,1'/>

        vcpus is a cpumap; the vCPU PIDs will be added to the tasks file.

      * Added CDP compatibility: the user can specify the l3 cache even when
        the host enables CDP. See patch 8.
        On a CDP-enabled host, specify l3code/l3data by:

        <cachetune id='0' host_id='0' type='l3' size='3072' unit='KiB'/>

        This will create a schemata like:
            L3data:0=0xff00;...
            L3code:0=0xff00;...

      * Would you please help to test whether the functions work?
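
      The CDP-compatible mode above can be sketched roughly like this
      (the way size, rounding, and helper names are illustrative
      assumptions, not the code from this series):

```python
# Sketch: turn a <cachetune size=.../> request into schemata lines.
# Numbers and names are illustrative; see virresctrl.c for the real logic.

def size_to_cbm(size_kib, way_size_kib, offset_bits=0):
    """Round size up to whole cache ways and build a contiguous CBM,
    as resctrl requires contiguous masks."""
    ways = -(-size_kib // way_size_kib)        # ceiling division
    return ((1 << ways) - 1) << offset_bits

def schemata_lines(bank_id, cbm, cdp=True):
    if cdp:  # CDP splits L3 into code/data; mirror the mask into both
        return [f"L3code:{bank_id}={cbm:x}", f"L3data:{bank_id}={cbm:x}"]
    return [f"L3:{bank_id}={cbm:x}"]

# 3072 KiB on a cache with (assumed) 1408-KiB ways -> 3 ways -> mask 0x7
print(schemata_lines(0, size_to_cbm(3072, 1408)))
# ['L3code:0=7', 'L3data:0=7']
```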

Martin:
      * XML test case: I have not had time to work on this yet. Would you
        please show me an example? I would like to amend it later.

This patch series supports the CAT feature, which is also
called cache tune in libvirt.

First, it exposes cache information which can be tuned in the capabilities XML.
Then it adds new domain XML element support to add a cache bank which will
apply to the libvirt domain.

This series adds a util file `virresctrl.c/h`, an interface to talk to the
Linux kernel's resctrl sysfs.

There is still one TODO left:
    1. Expose a new public interface to get free cache information.

Some discussion about this feature support can be found from:
https://www.redhat.com/archives/libvir-list/2017-January/msg00644.html


Eli Qiao (9):
  Resctrl: Add some utils functions
  Resctrl: expose cache information to capabilities
  Resctrl: Add new xml element to support cache tune
  Resctrl: Add private interface to set cachebanks
  Qemu: Set cache banks
  Resctrl: enable l3code/l3data
  Resctrl: Make sure l3data/l3code are pairs
  Resctrl: Compatible mode for cdp enabled
  Resctrl: concurrence support

 docs/schemas/domaincommon.rng |   46 ++
 include/libvirt/virterror.h   |    1 +
 po/POTFILES.in                |    1 +
 src/Makefile.am               |    1 +
 src/conf/capabilities.c       |   56 ++
 src/conf/capabilities.h       |   23 +
 src/conf/domain_conf.c        |  182 +++++++
 src/conf/domain_conf.h        |   19 +
 src/libvirt_private.syms      |   11 +
 src/nodeinfo.c                |   64 +++
 src/nodeinfo.h                |    1 +
 src/qemu/qemu_capabilities.c  |    8 +
 src/qemu/qemu_driver.c        |    6 +
 src/qemu/qemu_process.c       |   53 ++
 src/util/virerror.c           |    1 +
 src/util/virresctrl.c         | 1156 +++++++++++++++++++++++++++++++++++++++++
 src/util/virresctrl.h         |   88 ++++
 17 files changed, 1717 insertions(+)
 create mode 100644 src/util/virresctrl.c
 create mode 100644 src/util/virresctrl.h

--
1.9.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Marcelo Tosatti 7 years, 2 months ago
On Fri, Feb 10, 2017 at 02:42:04PM +0800, Eli Qiao wrote:
> Addressed comment from v3 -> v4:
> 
> Daniel & Marcelo:
>        * Added concurrence support
> 
> Addressed comment from v2 -> v3:
> 
> Daniel:
>       * Fixed coding style, passed `make check` and `make syntax-check`
> 
>       * Variables renaming and move from header file to c file.
> 
>       * For locking/mutex support, no progress.
> 
>       There are some discussion from mailing list, but I can not find a better
>       way to add locking support without performance impact.
> 
>       I'll explain the process and please help to advice what shoud we do.
> 
>       VM create:
>       1) Get the cache left value on each bank of the host. This should be
>          shared amount all VMs.
>       2) Calculate the schemata on the bank based on all created resctrl
>          domain's schemata
>       3) Calculate the default schemata by scaning all domain's schemata.
>       4) Flush default schemata to /sys/fs/resctrl/schemata
> 
>       VM destroy:
>       1) Remove the resctrl domain of that VM
>       2) Recalculate default schemata
>       3) Flush default schemata to /sys/fs/resctrl/schemata
> 
>       The key point is that all VMs shares /sys/fs/resctrl/schemata, and
>       when a VM create a resctrl domain, the schemata of that VM depends on
>       the default schemata and all other exsited schematas. So a global
>       mutex is reqired.
> 
>       Before calculate a schemata or update default schemata, libvirt
>       should gain this global mutex.
> 
>       I will try to think more about how to support this gracefully in next
>       patch set.
> 
> Marcelo:
>       * Added vcpu support for cachetune, this will allow user to define which
>         vcpu using which cache allocation bank.
> 
>         <cachetune id='0' host_id=0 size='3072' unit='KiB' vcpus='0,1'/>
> 
>         vcpus is a cpumap, the vcpu pids will be added to tasks file

Working as expected.

>       * Added cdp compatible, user can specify l3 cache even host enable cdp.
>         See patch 8.
>         On a cdp enabled host, specify l3code/l3data by
> 
>         <cachetune id='0' host_id='0' type='l3' size='3072' unit='KiB'/>
> 
>         This will create a schemata like:
>             L3data:0=0xff00;...
>             L3code:0=0xff00;...
> 
>       * Would you please help to test if the functions work.

Setting up CDP machine.

Unrelated:

Found a bug:

The code should scan for all directories in resctrlfs,
and then find free CBM space from that:


        free_cbm_space = ~(resctrldir1.CBM_bits &
                           resctrldir2.CBM_bits &
                           ...
                           resctrldirN.CBM_bits)

        For all resctrlfs directories.
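
A runnable sketch of that computation (free bits are those claimed by no
directory, i.e. the complement of the union of all per-directory masks,
truncated to the bank's CBM width; function name is illustrative):

```python
# Sketch of computing free CBM space from the resctrl directories'
# schemata masks for one cache bank.

def free_cbm_space(dir_cbms, cbm_len):
    """Bits set in no directory's CBM, within a cbm_len-bit bank."""
    used = 0
    for cbm in dir_cbms:       # scan every resctrl directory's mask
        used |= cbm
    return ~used & ((1 << cbm_len) - 1)

# "newres" holds 0x3 on bank 0 of a 20-bit cache; a correct allocator
# must pick the guest's bits from the remaining space.
print(hex(free_cbm_space([0x3], 20)))   # 0xffffc
```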

The bug is as follows:

Create a directory in resctrlfs by hand:

# mkdir newres
# cd newres
# echo "L3:0=3;1=f0000" > schemata
# virsh start guest
# cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
# cat schemata
L3:0=3;1=f0000

That is, it is using the same CBM space as the "newres"
reservation.

Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Eli Qiao 7 years, 2 months ago

--  
Eli Qiao
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Friday, 10 February 2017 at 10:19 PM, Marcelo Tosatti wrote:

> On Fri, Feb 10, 2017 at 02:42:04PM +0800, Eli Qiao wrote:
> > Addressed comment from v3 -> v4:
> >  
> > Daniel & Marcelo:
> > * Added concurrence support
> >  
> > Addressed comment from v2 -> v3:
> >  
> > Daniel:
> > * Fixed coding style, passed `make check` and `make syntax-check`
> >  
> > * Variables renaming and move from header file to c file.
> >  
> > * For locking/mutex support, no progress.
> >  
> > There are some discussion from mailing list, but I can not find a better
> > way to add locking support without performance impact.
> >  
> > I'll explain the process and please help to advice what shoud we do.
> >  
> > VM create:
> > 1) Get the cache left value on each bank of the host. This should be
> > shared amount all VMs.
> > 2) Calculate the schemata on the bank based on all created resctrl
> > domain's schemata
> > 3) Calculate the default schemata by scaning all domain's schemata.
> > 4) Flush default schemata to /sys/fs/resctrl/schemata
> >  
> > VM destroy:
> > 1) Remove the resctrl domain of that VM
> > 2) Recalculate default schemata
> > 3) Flush default schemata to /sys/fs/resctrl/schemata
> >  
> > The key point is that all VMs shares /sys/fs/resctrl/schemata, and
> > when a VM create a resctrl domain, the schemata of that VM depends on
> > the default schemata and all other exsited schematas. So a global
> > mutex is reqired.
> >  
> > Before calculate a schemata or update default schemata, libvirt
> > should gain this global mutex.
> >  
> > I will try to think more about how to support this gracefully in next
> > patch set.
> >  
> > Marcelo:
> > * Added vcpu support for cachetune, this will allow user to define which
> > vcpu using which cache allocation bank.
> >  
> > <cachetune id='0' host_id=0 size='3072' unit='KiB' vcpus='0,1'/>
> >  
> > vcpus is a cpumap, the vcpu pids will be added to tasks file
>  
> Working as expected.
>  
> > * Added cdp compatible, user can specify l3 cache even host enable cdp.
> > See patch 8.
> > On a cdp enabled host, specify l3code/l3data by
> >  
> > <cachetune id='0' host_id='0' type='l3' size='3072' unit='KiB'/>
> >  
> > This will create a schemata like:
> > L3data:0=0xff00;...
> > L3code:0=0xff00;...
> >  
> > * Would you please help to test if the functions work.
>  
> Setting up CDP machine.
>  
> Unrelated:
>  
> Found a bug:
>  
> The code should scan for all directories in resctrlfs,
> and then find free CBM space from that:
>  
>  
> free_cbm_space = ~(resctrldir1.CBM_bits &
> resctrldir2.CBM_bits &
> ...
> resctrldirN.CBM_bits)
>  
> For all resctrlfs directories.
>  
> The bug is as follows:
>  
> Create a directory in resctrlfs by hand:
>  
> # mkdir newres

Libvirt will not be aware of this after it starts running, so we should not
allow operating on /sys/fs/resctrl directly.

We will scan all directories when the libvirt daemon starts running, and
libvirt will remove existing directories that have no tasks inside them.

  
> # cd newres
> # echo "L3:0=3;1=f0000" > schemata
> # virsh start guest
> # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
> # cat schemata
> L3:0=3;1=f0000
>  
> That is, it is using the same CBM space as the "newres"
> reservation.
>  
>  


If a user creates a new directory after libvirt is running, libvirt does not
notice the newly created directory under /sys/fs/resctrl.

That will lead to a mess; it should be forbidden.

Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Marcelo Tosatti 7 years, 2 months ago
On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
> > > L3data:0=0xff00;...
> > > L3code:0=0xff00;...
> > >  
> > > * Would you please help to test if the functions work.
> >  
> > Setting up CDP machine.
> >  
> > Unrelated:
> >  
> > Found a bug:
> >  
> > The code should scan for all directories in resctrlfs,
> > and then find free CBM space from that:
> >  
> >  
> > free_cbm_space = ~(resctrldir1.CBM_bits &
> > resctrldir2.CBM_bits &
> > ...
> > resctrldirN.CBM_bits)
> >  
> > For all resctrlfs directories.
> >  
> > The bug is as follows:
> >  
> > Create a directory in resctrlfs by hand:
> >  
> > # mkdir newres
> 
> Libvirt will not aware this after it starts running, so we should not allow operate /sys/fs/.

Are you saying that usage of the resctrlfs filesystem should not be
allowed after libvirt starts? I don't think this is correct.

> we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.  



> 
>   
> > # cd newres
> > # echo "L3:0=3;1=f0000" > schemata
> > # virsh start guest
> > # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
> > # cat schemata
> > L3:0=3;1=f0000
> >  
> > That is, it is using the same CBM space as the "newres"
> > reservation.
> >  
> >  
> 
> 
> As user create a new directory after libvirt running, it don’t notice newly created directory under /sys/fs/resctrl.
> 
> That will lead mess, this should be forbidden.  

Well, this means that only libvirt can use the resctrlfs filesystem.

This prevents other applications which require allocation
of CAT resources from running.

It's simple to fix: move the scanning of resctrlfs data from libvirt
initialization time to:

	- VM initialization time
	and
	- virConnectGetCapabilities time

The second case is necessary to get updated free space information.



Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Martin Kletzander 7 years, 2 months ago
On Mon, Feb 13, 2017 at 04:09:13PM -0200, Marcelo Tosatti wrote:
>On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
>> > > L3data:0=0xff00;...
>> > > L3code:0=0xff00;...
>> > >
>> > > * Would you please help to test if the functions work.
>> >
>> > Setting up CDP machine.
>> >
>> > Unrelated:
>> >
>> > Found a bug:
>> >
>> > The code should scan for all directories in resctrlfs,
>> > and then find free CBM space from that:
>> >
>> >
>> > free_cbm_space = ~(resctrldir1.CBM_bits &
>> > resctrldir2.CBM_bits &
>> > ...
>> > resctrldirN.CBM_bits)
>> >
>> > For all resctrlfs directories.
>> >
>> > The bug is as follows:
>> >
>> > Create a directory in resctrlfs by hand:
>> >
>> > # mkdir newres
>>
>> Libvirt will not aware this after it starts running, so we should not allow operate /sys/fs/.
>
>Are you saying that usage of the resctrlfs filesystem should not be
>allowed after libvirt starts? I don't think this is correct.
>

In some cases the only way to keep things consistent would be
saying that if you are using libvirt, you should not change anything
behind its back.  It would be really wrong in this case, mainly when you
need to set the allocation for other apps, especially when there is a
nicely designed file that you can lock.

>> we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.
>

Definitely not.  What if someone wants to create another allocation?
They start by creating a directory, then libvirtd removes it before they
manage to add anything to tasks.

>
>
>>
>>
>> > # cd newres
>> > # echo "L3:0=3;1=f0000" > schemata
>> > # virsh start guest
>> > # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
>> > # cat schemata
>> > L3:0=3;1=f0000
>> >
>> > That is, it is using the same CBM space as the "newres"
>> > reservation.
>> >
>> >
>>
>>
>> As user create a new directory after libvirt running, it don’t notice newly created directory under /sys/fs/resctrl.
>>
>> That will lead mess, this should be forbidden.
>
>Well, this means that only libvirt can use the resctrlfs filesystem.
>
>This forbids other applications which require allocation
>of CAT resources from being used.
>
>Its simple to fix it: move the scanning of resctrlfs data from libvirt
>initialization time to:
>
>	- VM initialization time
>	and
>	- virConnectGetCapabilities time
>
>The second case is necessary to get updated free space information.
>

Just VM initialization time could be enough, as virConnectGetCapabilities
would only report the total size, and the free size would be reported in a
separate API (if I remember the discussion correctly).

Martin

>
>
Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Eli Qiao 7 years, 2 months ago



On Tuesday, 14 February 2017 at 4:37 PM, Martin Kletzander wrote:

> On Mon, Feb 13, 2017 at 04:09:13PM -0200, Marcelo Tosatti wrote:
> > On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
> > > > > L3data:0=0xff00;...
> > > > > L3code:0=0xff00;...
> > > > >  
> > > > > * Would you please help to test if the functions work.
> > > >  
> > > > Setting up CDP machine.
> > > >  
> > > > Unrelated:
> > > >  
> > > > Found a bug:
> > > >  
> > > > The code should scan for all directories in resctrlfs,
> > > > and then find free CBM space from that:
> > > >  
> > > >  
> > > > free_cbm_space = ~(resctrldir1.CBM_bits &
> > > > resctrldir2.CBM_bits &
> > > > ...
> > > > resctrldirN.CBM_bits)
> > > >  
> > > > For all resctrlfs directories.
> > > >  
> > > > The bug is as follows:
> > > >  
> > > > Create a directory in resctrlfs by hand:
> > > >  
> > > > # mkdir newres
> > >  
> > > Libvirt will not aware this after it starts running, so we should not allow operate /sys/fs/.
> >  
> > Are you saying that usage of the resctrlfs filesystem should not be
> > allowed after libvirt starts? I don't think this is correct.
> >  
>  
>  
> In some cases the only way to accomplish keeping consistency would be
> saying that if you are using libvirt, you should not change anything
> behind its back. It would be really wrong in this case, mainly when you
> need to set the allocation for other apps, especially when there is
> nicely designed file that you can lock.
>  

Allowing other apps to operate on /sys/fs/resctrl will mess up the result:
we consume cache and then change the default schemata, and if another app
creates a schemata which overlaps with our existing ones, the allocation
will fail. In other words, the allocated cache would be shared between VMs
and other apps.

My concern is that if some app creates a new schemata, by default it's
fffff; in our calculation the default schemata would then become 00001 (the
kernel doesn't allow 0), which indicates no cache is left, and libvirtd
would fail to boot the VM.

I really don't want to let users and libvirtd operate on /sys/fs/resctrl at
the same time.
  
>  
> > > we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.
> >  
>  
>  
> Definitely not. What if someone wants to create another allocation?
> They start by creating a directory, then libvirtd removes it before they
> manage to add anything to tasks.
>  
>  

libvirtd tries to clean up directories that have no tasks. Consider that
some VMs may die for unexpected reasons: their task IDs are lost and the
kernel removes them from resctrl's tasks files, but the directories remain,
which causes a cache leak.
  
>  
> >  
> >  
> > >  
> > >  
> > > > # cd newres
> > > > # echo "L3:0=3;1=f0000" > schemata
> > > > # virsh start guest
> > > > # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
> > > > # cat schemata
> > > > L3:0=3;1=f0000
> > > >  
> > > > That is, it is using the same CBM space as the "newres"
> > > > reservation.
> > > >  
> > >  
> > >  
> > >  
> > > As user create a new directory after libvirt running, it don’t notice newly created directory under /sys/fs/resctrl.
> > >  
> > > That will lead mess, this should be forbidden.
> >  
> > Well, this means that only libvirt can use the resctrlfs filesystem.
> >  
> > This forbids other applications which require allocation
> > of CAT resources from being used.
> >  
> > Its simple to fix it: move the scanning of resctrlfs data from libvirt
> > initialization time to:
> >  
> > - VM initialization time
> > and
> > - virConnectGetCapabilities time
> >  
> > The second case is necessary to get updated free space information.
>  
> Just VM initialization time could be enough as virConnectGetCapabilities
> would just know the total and free size would be reported in an API (if
> I rememer the discussion correctly)
>  
>  


Doing this will slow down the process of booting a VM if we rescan every time...
>  
> Martin
>  

Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Marcelo Tosatti 7 years, 2 months ago
On Tue, Feb 14, 2017 at 04:59:52PM +0800, Eli Qiao wrote:
> 
> 
> --  
> Eli Qiao
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> 
> 
> On Tuesday, 14 February 2017 at 4:37 PM, Martin Kletzander wrote:
> 
> > On Mon, Feb 13, 2017 at 04:09:13PM -0200, Marcelo Tosatti wrote:
> > > On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
> > > > > > L3data:0=0xff00;...
> > > > > > L3code:0=0xff00;...
> > > > > >  
> > > > > > * Would you please help to test if the functions work.
> > > > >  
> > > > > Setting up CDP machine.
> > > > >  
> > > > > Unrelated:
> > > > >  
> > > > > Found a bug:
> > > > >  
> > > > > The code should scan for all directories in resctrlfs,
> > > > > and then find free CBM space from that:
> > > > >  
> > > > >  
> > > > > free_cbm_space = ~(resctrldir1.CBM_bits &
> > > > > resctrldir2.CBM_bits &
> > > > > ...
> > > > > resctrldirN.CBM_bits)
> > > > >  
> > > > > For all resctrlfs directories.
> > > > >  
> > > > > The bug is as follows:
> > > > >  
> > > > > Create a directory in resctrlfs by hand:
> > > > >  
> > > > > # mkdir newres
> > > >  
> > > > Libvirt will not aware this after it starts running, so we should not allow operate /sys/fs/.
> > >  
> > > Are you saying that usage of the resctrlfs filesystem should not be
> > > allowed after libvirt starts? I don't think this is correct.
> > >  
> >  
> >  
> > In some cases the only way to accomplish keeping consistency would be
> > saying that if you are using libvirt, you should not change anything
> > behind its back. It would be really wrong in this case, mainly when you
> > need to set the allocation for other apps, especially when there is
> > nicely designed file that you can lock.
> >  
> 
> That will mess up the result if allow other app to operate /sys/fs/resctrl, we consume cache and then change the default schemata, if other apps
> create a schemata which has overlap with our existed one, the allocation will fail. which is saying the allocated cache will be shared between VMs and other apps.

This is why the other app and libvirt should use the filesystem lock. 

> My concern is that if some app create a new schemata, by default it’s fffff, in our calculate process, the default schemata will be 00001 (kernel don’t allow us for 0), which indicate no cache left …
> libvirtd will fail to boot VM.

If both libvirtd and all apps use the filesystem lock, and do not use
CBM space from other CAT reservations, then there will be no problem.

> I really don’t want to let user and libvirtd operate /sys/fs/resctrl at same time.

This is necessary for the NFV use case. Example:

OVS-DPDK receiving packets from network card 
and forwarding them to VMs.

Both OVS-DPDK and the VMs use the resctrl filesystem.


> > > > we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.
> > >  
> >  
> >  
> > Definitely not. What if someone wants to create another allocation?
> > They start by creating a directory, then libvirtd removes it before they
> > manage to add anything to tasks.
> >  
> >  
> 
> libvirtd try to scan them which has no tasks to keep things cleanup, think about that some VMs may fail due to unexpected reason, there task ids are lost and kernel remove them from resctrl.
> this causes cache leak.
>   
> >  
> > >  
> > >  
> > > >  
> > > >  
> > > > > # cd newres
> > > > > # echo "L3:0=3;1=f0000" > schemata
> > > > > # virsh start guest
> > > > > # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
> > > > > # cat schemata
> > > > > L3:0=3;1=f0000
> > > > >  
> > > > > That is, it is using the same CBM space as the "newres"
> > > > > reservation.
> > > > >  
> > > >  
> > > >  
> > > >  
> > > > As user create a new directory after libvirt running, it don’t notice newly created directory under /sys/fs/resctrl.
> > > >  
> > > > That will lead mess, this should be forbidden.
> > >  
> > > Well, this means that only libvirt can use the resctrlfs filesystem.
> > >  
> > > This forbids other applications which require allocation
> > > of CAT resources from being used.
> > >  
> > > Its simple to fix it: move the scanning of resctrlfs data from libvirt
> > > initialization time to:
> > >  
> > > - VM initialization time
> > > and
> > > - virConnectGetCapabilities time
> > >  
> > > The second case is necessary to get updated free space information.
> >  
> > Just VM initialization time could be enough as virConnectGetCapabilities
> > would just know the total and free size would be reported in an API (if
> > I rememer the discussion correctly)
> >  
> >  
> 
> 
> By doing this, it will slowdown the process of booting a VM if we scan it every time...  

This is OK; if performance turns out to be a problem, the code can
be optimized.

But first it has to be implemented correctly, which means supporting
other users of resctrl filesystem (see the OVS-DPDK example above).

> >  
> > Martin
> >  
> 

Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Marcelo Tosatti 7 years, 2 months ago
On Tue, Feb 14, 2017 at 09:37:02AM +0100, Martin Kletzander wrote:
> On Mon, Feb 13, 2017 at 04:09:13PM -0200, Marcelo Tosatti wrote:
> >On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
> >>> > L3data:0=0xff00;...
> >>> > L3code:0=0xff00;...
> >>> >
> >>> > * Would you please help to test if the functions work.
> >>>
> >>> Setting up CDP machine.
> >>>
> >>> Unrelated:
> >>>
> >>> Found a bug:
> >>>
> >>> The code should scan for all directories in resctrlfs,
> >>> and then find free CBM space from that:
> >>>
> >>>
> >>> free_cbm_space = ~(resctrldir1.CBM_bits &
> >>> resctrldir2.CBM_bits &
> >>> ...
> >>> resctrldirN.CBM_bits)
> >>>
> >>> For all resctrlfs directories.
> >>>
> >>> The bug is as follows:
> >>>
> >>> Create a directory in resctrlfs by hand:
> >>>
> >>> # mkdir newres
> >>
> >>Libvirt will not aware this after it starts running, so we should not allow operate /sys/fs/.
> >
> >Are you saying that usage of the resctrlfs filesystem should not be
> >allowed after libvirt starts? I don't think this is correct.
> >
> 
> In some cases the only way to accomplish keeping consistency would be
> saying that if you are using libvirt, you should not change anything
> behind its back.  It would be really wrong in this case, mainly when you
> need to set the allocation for other apps, especially when there is
> nicely designed file that you can lock.
> 
> >>we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.
> >
> 
> Definitely not.  What if someone wants to create another allocation?
> They start by creating a directory, then libvirtd removes it before they
> manage to add anything to tasks.

If the other application is properly using the filesystem lock, then: 

OTHER APP, CREATE PROCEDURE:

	1. grab fs lock.
	2. create directory.
	3. write schemata.
	4. write tasks.
	5. release fs lock

Any operation that writes to any file in resctrlfs will first grab 
the lock. So the problem being discussed is handled.
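
The five-step procedure above could look roughly like this (a sketch only:
the lock target and the explicit file writes are assumptions for
illustration; a real resctrl mount creates schemata/tasks automatically on
mkdir):

```python
# Sketch of the locked create procedure above. The resctrl root is a
# parameter so the flow can be exercised on an ordinary directory;
# nothing here is libvirt's actual implementation.
import fcntl
import os

def create_reservation(root, name, schemata, pids):
    lockfile = open(os.path.join(root, "schemata"))  # agreed lock target
    fcntl.flock(lockfile, fcntl.LOCK_EX)             # 1. grab fs lock
    try:
        group = os.path.join(root, name)
        os.mkdir(group)                              # 2. create directory
        with open(os.path.join(group, "schemata"), "w") as f:
            f.write(schemata + "\n")                 # 3. write schemata
        with open(os.path.join(group, "tasks"), "w") as f:
            for pid in pids:                         # 4. write tasks
                f.write(f"{pid}\n")
    finally:
        fcntl.flock(lockfile, fcntl.LOCK_UN)         # 5. release fs lock
        lockfile.close()
```

Because the lock is advisory, this only works if libvirt and every other
resctrl user agree on the same lock target.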

> >
> >
> >>
> >>
> >>> # cd newres
> >>> # echo "L3:0=3;1=f0000" > schemata
> >>> # virsh start guest
> >>> # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
> >>> # cat schemata
> >>> L3:0=3;1=f0000
> >>>
> >>> That is, it is using the same CBM space as the "newres"
> >>> reservation.
> >>>
> >>>
> >>
> >>
> >>As user create a new directory after libvirt running, it don’t notice newly created directory under /sys/fs/resctrl.
> >>
> >>That will lead mess, this should be forbidden.
> >
> >Well, this means that only libvirt can use the resctrlfs filesystem.
> >
> >This forbids other applications which require allocation
> >of CAT resources from being used.
> >
> >Its simple to fix it: move the scanning of resctrlfs data from libvirt
> >initialization time to:
> >
> >	- VM initialization time
> >	and
> >	- virConnectGetCapabilities time
> >
> >The second case is necessary to get updated free space information.
> >
> 
> Just VM initialization time could be enough as virConnectGetCapabilities
> would just know the total and free size would be reported in an API (if
> I rememer the discussion correctly)

Ok so resctrlfs has to be rescanned in whatever location the
free size is reported to the user.

Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Eli Qiao 7 years, 2 months ago



On Wednesday, 15 February 2017 at 1:14 AM, Marcelo Tosatti wrote:

> On Tue, Feb 14, 2017 at 09:37:02AM +0100, Martin Kletzander wrote:
> > On Mon, Feb 13, 2017 at 04:09:13PM -0200, Marcelo Tosatti wrote:
> > > On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
> > > > > > L3data:0=0xff00;...
> > > > > > L3code:0=0xff00;...
> > > > > >  
> > > > > > * Would you please help to test if the functions work.
> > > > >  
> > > > > Setting up CDP machine.
> > > > >  
> > > > > Unrelated:
> > > > >  
> > > > > Found a bug:
> > > > >  
> > > > > The code should scan for all directories in resctrlfs,
> > > > > and then find free CBM space from that:
> > > > >  
> > > > >  
> > > > > free_cbm_space = ~(resctrldir1.CBM_bits &
> > > > > resctrldir2.CBM_bits &
> > > > > ...
> > > > > resctrldirN.CBM_bits)
> > > > >  
> > > > > For all resctrlfs directories.
> > > > >  
> > > > > The bug is as follows:
> > > > >  
> > > > > Create a directory in resctrlfs by hand:
> > > > >  
> > > > > # mkdir newres
> > > >  
> > > > Libvirt will not aware this after it starts running, so we should not allow operate /sys/fs/.
> > >  
> > > Are you saying that usage of the resctrlfs filesystem should not be
> > > allowed after libvirt starts? I don't think this is correct.
> > >  
> >  
> >  
> > In some cases the only way to accomplish keeping consistency would be
> > saying that if you are using libvirt, you should not change anything
> > behind its back. It would be really wrong in this case, mainly when you
> > need to set the allocation for other apps, especially when there is
> > nicely designed file that you can lock.
> >  
> > > > we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.
> >  
> > Definitely not. What if someone wants to create another allocation?
> > They start by creating a directory, then libvirtd removes it before they
> > manage to add anything to tasks.
> >  
>  
>  
> If the other application is properly using the filesystem lock, then:  
>  
> OTHER APP, CREATE PROCEDURE:
>  
> 1. grab fs lock.
> 2. create directory.
> 3. write schemata.
> 4. write tasks.
> 5. release fs lock
>  
> Any operation that writes to any file in resctrlfs will first grab  
> the lock. So the problem being discussed is handled.
>  
Sure, as long as the application can manage the schemata well.  
>  
> > >  
> > >  
> > > >  
> > > >  
> > > > > # cd newres
> > > > > # echo "L3:0=3;1=f0000" > schemata
> > > > > # virsh start guest
> > > > > # cd ../b4c270b5-e0f9-4106-a446-69032872ed7e
> > > > > # cat schemata
> > > > > L3:0=3;1=f0000
> > > > >  
> > > > > That is, it is using the same CBM space as the "newres"
> > > > > reservation.
> > > > >  
> > > >  
> > > >  
> > > >  
> > > > As user create a new directory after libvirt running, it don’t notice newly created directory under /sys/fs/resctrl.
> > > >  
> > > > That will lead mess, this should be forbidden.
> > >  
> > > Well, this means that only libvirt can use the resctrlfs filesystem.
> > >  
> > > This forbids other applications which require allocation
> > > of CAT resources from being used.
> > >  
> > > Its simple to fix it: move the scanning of resctrlfs data from libvirt
> > > initialization time to:
> > >  
> > > - VM initialization time
> > > and
> > > - virConnectGetCapabilities time
> > >  
> > > The second case is necessary to get updated free space information.
> >  
> > Just VM initialization time could be enough as virConnectGetCapabilities
> > would just know the total and free size would be reported in an API (if
> > I rememer the discussion correctly)
> >  
>  
>  
> Ok so resctrlfs has to be rescanned in whatever location the
> free size is reported to the user.
>  
>  

Sure, I will add this when adding the free-cache API.

Re: [libvirt] [v4 0/9] Support cache tune in libvirt
Posted by Martin Kletzander 7 years, 2 months ago
On Wed, Feb 15, 2017 at 01:40:30PM +0800, Eli Qiao wrote:
>On Wednesday, 15 February 2017 at 1:14 AM, Marcelo Tosatti wrote:
>> On Tue, Feb 14, 2017 at 09:37:02AM +0100, Martin Kletzander wrote:
>> > On Mon, Feb 13, 2017 at 04:09:13PM -0200, Marcelo Tosatti wrote:
>> > > On Mon, Feb 13, 2017 at 03:42:04PM +0800, Eli Qiao wrote:
>> > > > we will scan for all directors while the libvirt daemon begin running, and libvirt will remove exist directories if no tasks inside of it.
>> >
>> > Definitely not. What if someone wants to create another allocation?
>> > They start by creating a directory, then libvirtd removes it before they
>> > manage to add anything to tasks.
>> >
>>
>>
>> If the other application is properly using the filesystem lock, then:
>>
>> OTHER APP, CREATE PROCEDURE:
>>
>> 1. grab fs lock.
>> 2. create directory.
>> 3. write schemata.
>> 4. write tasks.
>> 5. release fs lock
>>
>> Any operation that writes to any file in resctrlfs will first grab
>> the lock. So the problem being discussed is handled.
>>

Yes, my point was that this is yet another reason why we *must* use the
lock.  Then it's fine.  I should've been more clear, sorry.