[PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU

Julio Faracco posted 3 patches 3 years, 1 month ago
Test syntax-check failed
Patches applied successfully
git fetch https://github.com/patchew-project/libvirt tags/patchew/20210227040635.73934-1-jcfaracco@gmail.com
[PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU
Posted by Julio Faracco 3 years, 1 month ago
This patch series implements checkpoint/restore for the LXC driver using
CRIU operations. It follows the other methods of saving and restoring process
state: a file with a header that carries some metadata. The only difference here
is how the LXC driver joins the files produced by CRIU. CRIU generates
a lot of 'img' files, and they are bundled using TAR to fit into the libvirt
state file.
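
As a rough illustration of the layout described above (a header with some
metadata, followed by the tarred CRIU images), here is a minimal Python
sketch. The magic string, the header fields and the function names are
hypothetical; they are not taken from these patches or from libvirt's real
save format, and the image names only mimic what CRIU typically produces.

```python
import io
import struct
import tarfile

# Hypothetical header: 12-byte magic, format version, length of the XML blob.
MAGIC = b"HypoLXCSave\0"
HEADER = struct.Struct("<12sII")


def pack_state(xml, images):
    """Return header + domain XML + a tar archive holding the image files."""
    payload = io.BytesIO()
    with tarfile.open(fileobj=payload, mode="w") as tar:
        for name, data in sorted(images.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return HEADER.pack(MAGIC, 1, len(xml)) + xml + payload.getvalue()


def unpack_state(blob):
    """Split a blob made by pack_state() back into XML and image files."""
    magic, version, xml_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC or version != 1:
        raise ValueError("not a recognised state file")
    xml_end = HEADER.size + xml_len
    xml = blob[HEADER.size:xml_end]
    images = {}
    with tarfile.open(fileobj=io.BytesIO(blob[xml_end:]), mode="r") as tar:
        for member in tar:
            images[member.name] = tar.extractfile(member).read()
    return xml, images
```

The point of the sketch is only that a single state file can carry both the
metadata and an arbitrary number of image files once they are archived.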

Julio Faracco (3):
  meson: Add support to CRIU binary into meson
  lxc: Including CRIU functions and functions to support C/R.
  lxc: Adding support to LXC driver to restore a container

 meson.build              |  10 +
 meson_options.txt        |   1 +
 src/lxc/lxc_conf.c       |   3 +
 src/lxc/lxc_conf.h       |   2 +
 src/lxc/lxc_container.c  | 188 +++++++++++++++++-
 src/lxc/lxc_container.h  |   3 +-
 src/lxc/lxc_controller.c |  93 ++++++++-
 src/lxc/lxc_criu.c       | 405 +++++++++++++++++++++++++++++++++++++++
 src/lxc/lxc_criu.h       |  50 +++++
 src/lxc/lxc_driver.c     | 341 +++++++++++++++++++++++++++++++-
 src/lxc/lxc_process.c    |  26 ++-
 src/lxc/lxc_process.h    |   1 +
 src/lxc/meson.build      |   2 +
 13 files changed, 1106 insertions(+), 19 deletions(-)
 create mode 100644 src/lxc/lxc_criu.c
 create mode 100644 src/lxc/lxc_criu.h

-- 
2.27.0

Re: [PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU
Posted by Julio Faracco 3 years, 1 month ago
Hi guys,

I marked this series as RFC to discuss some points. I'm interested in
enhancing this specific part of LXC. So, here are some questions on which
I would like feedback from the community:
1. I decided to use tar to bundle all CRIU img files into a single
file. Any other suggestions?
2. If there are none, is there a consensus on whether command line
calls or libraries are preferred? I would like to use libtar, for
instance, because I personally think the command line approach is ugly.
I'm not sure whether I'm able to do that. The same goes for CRIU.
3. Any other important opinions, obviously.
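
To make question 2 more concrete, the command line route could look
roughly like the Python sketch below. It is not the code from the patches.
The helper names and log file name are made up for illustration, and the
only options used are the long forms of CRIU's --tree, --images-dir and
--log-file plus -v4, and tar's -c, -f and -C. The commands are only built,
not executed, so neither criu nor tar is needed to try it.

```python
import subprocess
import sys


def criu_dump_argv(pid, images_dir):
    """Build (but do not run) a `criu dump` command line."""
    return ["criu", "dump", "--tree", str(pid),
            "--images-dir", images_dir,
            "--log-file", "dump.log", "-v4"]


def tar_create_argv(archive, images_dir):
    """Build a tar command that archives everything under images_dir."""
    return ["tar", "-cf", archive, "-C", images_dir, "."]


def run_checked(argv):
    """Run a command and surface its stderr if it fails."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            f"{argv[0]} exited with {proc.returncode}: {proc.stderr.strip()}")
    return proc.stdout


if __name__ == "__main__":
    # Harmless demo: run the Python interpreter instead of criu/tar.
    print(run_checked([sys.executable, "-c", "print('ok')"]), end="")
    print(criu_dump_argv(1234, "/tmp/images"))
```

With external commands, error reporting is limited to the exit status and
whatever ends up on stderr or in the log file, which is part of why the
library question matters.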

--
Julio Cesar Faracco



Re: [PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU
Posted by Martin Kletzander 3 years ago
On Sat, Feb 27, 2021 at 01:14:29AM -0300, Julio Faracco wrote:
>Hi guys,
>

Hi and sorry for not replying earlier.

>I marked this series as RFC to discuss some points. I'm interested in
>enhancing this specific part of LXC. So, some questions that I would
>like to hear as a feedback from community:
>1. I decided to use a tar to compress all CRIU img files into a single
>file. Any other suggestions?
>2. If no is the answer to question above, is there a consensus on
>preferring to use command line calls or libraries? I would like to use
>libtar for instance. I personally think that this approach is ugly.
>Not sure if I'm able to do that. The same for CRIU.

I remember that for CRIU, back when we were trying to do that, the issue
was that the commands were not atomic, did not properly report error
messages, and maybe something more along those lines.  Either there was no
library interface or it was not MT-safe; basically there were a couple of
issues like that which we were not able to deal with.

I do not really remember all the details.  Maybe Michal does, as I think
he suggested the idea back then.  I Cc'd him.  In the worst case we
will need to figure this all out again ;)

Re: [PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU
Posted by Michal Privoznik 3 years ago
On 4/1/21 12:01 AM, Martin Kletzander wrote:
> On Sat, Feb 27, 2021 at 01:14:29AM -0300, Julio Faracco wrote:
>> Hi guys,
>>
> 
> Hi and sorry for not replying earlier.
> 

Yeah, sorry. I have this marked for review and yet still haven't done so.

>> I marked this series as RFC to discuss some points. I'm interested in
>> enhancing this specific part of LXC. So, some questions that I would
>> like to hear as a feedback from community:
>> 1. I decided to use a tar to compress all CRIU img files into a single
>> file. Any other suggestions?
>> 2. If no is the answer to question above, is there a consensus on
>> preferring to use command line calls or libraries? I would like to use
>> libtar for instance. I personally think that this approach is ugly.
>> Not sure if I'm able to do that. The same for CRIU.
> 
> I remember that for CRIU, back when we were trying to do that, the issue
> was that the commands were not atomic, did not properly report error
> messages and maybe something more along the lines.  Either there was no
> library interface or it was not MT-safe, basically there were couple of
> issues like that which we were not able to deal with.
> 
> I do not really remember all the details.  Maybe Michal does as I think
> he suggested the idea back then.  I Cc'd him.  In the worst scenario we
> will need to figure this all out again ;)

IIRC the main problem was that we wanted CRIU to be able to send its
data over a TCP connection. Back then, when a GSoC student was looking
at this, CRIU was only able to store data into a file (or even multiple
files in a directory?) and wasn't able to create a server/client
connection. Maybe this has changed since then? If not, then we can use
tar, sure. And to transfer the data we can use so-called tunnelled
migration, where the migration stream is sent over the libvirt connection
rather than directly to the other side (because then we would have to
have nc or similar involved).

https://libvirt.org/migration.html#transporttunnel
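
To illustrate why tar fits here: a tar archive can be written to and read
from a non-seekable stream, so the image files do not have to land in a
file on the sending side first. The Python sketch below pushes a tar
stream through a local socket pair. It is only an analogy, assuming a plain
socket stands in for the connection; it does not use libvirt's stream API,
and the function names are invented for the example.

```python
import io
import socket
import tarfile
import threading


def send_images(conn, images):
    """Write the images as a tar stream to a connected socket."""
    with conn.makefile("wb") as stream:
        # "w|" selects tarfile's streaming mode (no seeking required).
        with tarfile.open(fileobj=stream, mode="w|") as tar:
            for name, data in images.items():
                info = tarfile.TarInfo(name=name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    conn.close()


def receive_images(conn):
    """Read a tar stream from a connected socket into a dict."""
    images = {}
    with conn.makefile("rb") as stream:
        with tarfile.open(fileobj=stream, mode="r|") as tar:
            # In streaming mode each member must be read in order.
            for member in tar:
                images[member.name] = tar.extractfile(member).read()
    conn.close()
    return images


def roundtrip(images):
    """Send images through a socket pair and return what arrives."""
    left, right = socket.socketpair()
    sender = threading.Thread(target=send_images, args=(left, images))
    sender.start()
    received = receive_images(right)
    sender.join()
    return received
```

The same idea would apply if the byte stream were carried by the libvirt
connection instead of a raw socket.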

Another issue was that it couldn't handle all namespaces (but I'm not
certain, it was 5 years ago).

But let me find some time and review the patches.

Michal

Re: [PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU
Posted by Julio Faracco 3 years ago
Hi Michal and Martin,

Thanks for your reply.
Just to clarify: I'm not directly interested in developing this
specific feature.
If there is a GSoC student assigned to this, excellent.
I'm interested in developing snapshots and container migration, which
unfortunately require this feature.
Unless you have another opinion.

--
Julio Faracco



Re: [PATCH RFC 0/3] Add checkpoint/restore support to LXC using CRIU
Posted by Martin Kletzander 3 years ago
On Thu, Apr 01, 2021 at 10:16:36AM -0300, Julio Faracco wrote:
>Hi Michal and Martin,
>
>Thanks for your reply.
>Just an explanation. I'm not interested directly in developing this
>specific feature.
>If there is a GSoC student addressed to this... Excellent.
>I'm interested in developing snapshot and container migration which
>unfortunately requires this feature.
>Unless you have another opinion.
>

That student was working on it, but that was 5 years ago as Michal said.
I'm afraid that it's either you or nobody who does that =)
