[libvirt] [PATCH] util: xml: Don't conflict with other libxml2 user callbacks

Cole Robinson posted 1 patch 6 years, 2 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/5f7378fda41cb0f60ce1a0c59fb3717d04556d45.1519428075.git.crobinso@redhat.com
Test syntax-check passed
src/util/virxml.c | 5 +++++
1 file changed, 5 insertions(+)
[libvirt] [PATCH] util: xml: Don't conflict with other libxml2 user callbacks
Posted by Cole Robinson 6 years, 2 months ago
lxml is a popular python XML processing library. It uses libxml2
behind the scenes, and registers custom callbacks via
xmlSetExternalEntityLoader. However this can cause crashes if
if an app uses both lxml and libxml2 together in the same process.

This is a known limitation of lxml and libxml2 generally. It also
prevents us from using lxml in virt-manager:

https://bugzilla.redhat.com/show_bug.cgi?id=1544019

However it's easy enough to work around in libvirt, by unsetting the
EntityLoader callback to a known state before we ask libxml2 to
parse a file from disk.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 src/util/virxml.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/util/virxml.c b/src/util/virxml.c
index 6e87605ea..3e01794f9 100644
--- a/src/util/virxml.c
+++ b/src/util/virxml.c
@@ -810,9 +810,14 @@ virXMLParseHelper(int domcode,
     pctxt->sax->error = catchXMLError;
 
     if (filename) {
+        /* Reset any libxml2 file callbacks, other libs (like python lxml)
+         * may have set their own which can get crashy */
+        xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader();
+        xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader);
         xml = xmlCtxtReadFile(pctxt, filename, NULL,
                               XML_PARSE_NONET |
                               XML_PARSE_NOWARNING);
+        xmlSetExternalEntityLoader(origloader);
     } else {
         xml = xmlCtxtReadDoc(pctxt, BAD_CAST xmlStr, url, NULL,
                              XML_PARSE_NONET |
-- 
2.14.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] util: xml: Don't conflict with other libxml2 user callbacks
Posted by Ján Tomko 6 years, 1 month ago
On Fri, Feb 23, 2018 at 06:21:15PM -0500, Cole Robinson wrote:
>lxml is a popular python XML processing library. It uses libxml2
>behind the scenes, and registers custom callbacks via
>xmlSetExternalEntityLoader. However this can cause crashes if
>if an app uses both lxml and libxml2 together in the same process.
>
>This is a known limitation of lxml and libxml2 generally. It also
>prevents us from using lxml in virt-manager:
>
>https://bugzilla.redhat.com/show_bug.cgi?id=1544019
>
>However it's easy enough to work around in libvirt, by unsetting the
>EntityLoader callback to a known state before we ask libxml2 to
>parse a file from disk.
>
>Signed-off-by: Cole Robinson <crobinso@redhat.com>
>---
> src/util/virxml.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
>diff --git a/src/util/virxml.c b/src/util/virxml.c
>index 6e87605ea..3e01794f9 100644
>--- a/src/util/virxml.c
>+++ b/src/util/virxml.c
>@@ -810,9 +810,14 @@ virXMLParseHelper(int domcode,
>     pctxt->sax->error = catchXMLError;
>
>     if (filename) {
>+        /* Reset any libxml2 file callbacks, other libs (like python lxml)
>+         * may have set their own which can get crashy */
>+        xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader();
>+        xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader);
>         xml = xmlCtxtReadFile(pctxt, filename, NULL,
>                               XML_PARSE_NONET |
>                               XML_PARSE_NOWARNING);
>+        xmlSetExternalEntityLoader(origloader);

This does not look thread-safe at all - what if two libvirt threads
try to parse some XML at the same time?

Jan

>     } else {
>         xml = xmlCtxtReadDoc(pctxt, BAD_CAST xmlStr, url, NULL,
>                              XML_PARSE_NONET |
>-- 
>2.14.3
>
>--
>libvir-list mailing list
>libvir-list@redhat.com
>https://www.redhat.com/mailman/listinfo/libvir-list
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] util: xml: Don't conflict with other libxml2 user callbacks
Posted by Daniel P. Berrangé 6 years, 1 month ago
On Mon, Feb 26, 2018 at 09:48:39AM +0100, Ján Tomko wrote:
> On Fri, Feb 23, 2018 at 06:21:15PM -0500, Cole Robinson wrote:
> > lxml is a popular python XML processing library. It uses libxml2
> > behind the scenes, and registers custom callbacks via
> > xmlSetExternalEntityLoader. However this can cause crashes if
> > if an app uses both lxml and libxml2 together in the same process.
> > 
> > This is a known limitation of lxml and libxml2 generally. It also
> > prevents us from using lxml in virt-manager:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1544019
> > 
> > However it's easy enough to work around in libvirt, by unsetting the
> > EntityLoader callback to a known state before we ask libxml2 to
> > parse a file from disk.
> > 
> > Signed-off-by: Cole Robinson <crobinso@redhat.com>
> > ---
> > src/util/virxml.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> > 
> > diff --git a/src/util/virxml.c b/src/util/virxml.c
> > index 6e87605ea..3e01794f9 100644
> > --- a/src/util/virxml.c
> > +++ b/src/util/virxml.c
> > @@ -810,9 +810,14 @@ virXMLParseHelper(int domcode,
> >     pctxt->sax->error = catchXMLError;
> > 
> >     if (filename) {
> > +        /* Reset any libxml2 file callbacks, other libs (like python lxml)
> > +         * may have set their own which can get crashy */
> > +        xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader();
> > +        xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader);
> >         xml = xmlCtxtReadFile(pctxt, filename, NULL,
> >                               XML_PARSE_NONET |
> >                               XML_PARSE_NOWARNING);
> > +        xmlSetExternalEntityLoader(origloader);
> 
> This does not look thread-safe at all - what if two libvirt threads
> try to parse some XML at the same time?

Indeed, that is not thread safe - I checked the libxml code and it just
sets a static variable, not thread local.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list