[libvirt] [PATCH] RFC: security: Make sure to decrease ref count label value

Stefan Berger posted 1 patch 4 years, 9 months ago
Test syntax-check passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20190726183822.1895339-1-stefanb@linux.vnet.ibm.com
src/security/security_selinux.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
[libvirt] [PATCH] RFC: security: Make sure to decrease ref count label value
Posted by Stefan Berger 4 years, 9 months ago
I noticed that if a domain fails to restore, the ref count in the xattr
'trusted.libvirt.security.ref_selinux' keeps on increasing indefinitely
and the VM will never restore even if the root cause for the restore
failure has been removed. The reason seems to be that the code to decrease
the ref count never gets called because the block above it fails due
to virSecuritySELinuxTransactionAppend() failing. The simple solution
seems to be to revert the order in which things are done.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 src/security/security_selinux.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
index ea20373a90..9fd29e9bca 100644
--- a/src/security/security_selinux.c
+++ b/src/security/security_selinux.c
@@ -1499,14 +1499,9 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr,
         goto cleanup;
     }
 
-    if ((rc = virSecuritySELinuxTransactionAppend(path, NULL,
-                                                  false, recall, true)) < 0) {
-        goto cleanup;
-    } else if (rc > 0) {
-        ret = 0;
-        goto cleanup;
-    }
-
+    /* Recall the label so the ref count label decreases its counter
+     * even if transaction append below fails.
+     */
     if (recall) {
         rc = virSecuritySELinuxRecallLabel(newpath, &fcon);
         if (rc == -2) {
@@ -1519,6 +1514,14 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr,
         }
     }
 
+    if ((rc = virSecuritySELinuxTransactionAppend(path, NULL,
+                                                  false, recall, true)) < 0) {
+        goto cleanup;
+    } else if (rc > 0) {
+        ret = 0;
+        goto cleanup;
+    }
+
     if (!recall || rc == -2) {
         if (stat(newpath, &buf) != 0) {
             VIR_WARN("cannot stat %s: %s", newpath,
-- 
2.20.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] RFC: security: Make sure to decrease ref count label value
Posted by Michal Privoznik 4 years, 9 months ago
On 7/26/19 8:38 PM, Stefan Berger wrote:
> I noticed that if a domain fails to restore, the ref count in the xattr
> 'trusted.libvirt.security.ref_selinux' keeps on increasing indefinitely
> and the VM will never restore even if the root cause for the restore
> failure has been removed. The reason seems to be that the code to decrease
> the ref count never gets called because the block above it fails due
> to virSecuritySELinuxTransactionAppend() failing. The simple solution
> seems to be to revert the order in which things are done.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>   src/security/security_selinux.c | 19 +++++++++++--------
>   1 file changed, 11 insertions(+), 8 deletions(-)

So what was the root cause, did you ever find out?

> 
> diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
> index ea20373a90..9fd29e9bca 100644
> --- a/src/security/security_selinux.c
> +++ b/src/security/security_selinux.c
> @@ -1499,14 +1499,9 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr,
>           goto cleanup;
>       }
>   
> -    if ((rc = virSecuritySELinuxTransactionAppend(path, NULL,
> -                                                  false, recall, true)) < 0) {
> -        goto cleanup;
> -    } else if (rc > 0) {
> -        ret = 0;
> -        goto cleanup;
> -    }
> -
> +    /* Recall the label so the ref count label decreases its counter
> +     * even if transaction append below fails.
> +     */
>       if (recall) {
>           rc = virSecuritySELinuxRecallLabel(newpath, &fcon);
>           if (rc == -2) {
> @@ -1519,6 +1514,14 @@ virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr,
>           }
>       }
>   
> +    if ((rc = virSecuritySELinuxTransactionAppend(path, NULL,
> +                                                  false, recall, true)) < 0) {
> +        goto cleanup;
> +    } else if (rc > 0) {
> +        ret = 0;
> +        goto cleanup;
> +    }
> +
>       if (!recall || rc == -2) {
>           if (stat(newpath, &buf) != 0) {
>               VIR_WARN("cannot stat %s: %s", newpath,
> 

IIUC, this gets then called twice, doesn't it? I mean, transactions are 
hackish a bit. We wanted chown() and setfilecon_raw() to run inside the 
mount namespace of a domain. So instead of rewriting our secdrivers and 
making them enter the namespace on every chown()/setfilecon_raw() we put 
small code snippets just before these calls that would record arguments 
for these functions in a thread local variable and then 
transactionCommit() would enter the namespace and replay all 
chown()/setfilecon_raw().

That's why there can't be any code before transactionAppend(), because 
it would be called twice - once from regular call and from 
transactionCommit(). For instance:

virSecuritySELinuxDomainSetPathLabel()
  -> virSecuritySELinuxSetFilecon()
    -> virSecuritySELinuxSetFileconHelper()
      -> virSecuritySELinuxTransactionAppend()

virSecuritySELinuxTransactionCommit()
   -> virSecuritySELinuxTransactionRun()
     -> virSecuritySELinuxSetFileconHelper()
       -> virSecuritySELinuxSetFileconImpl()
         -> setfilecon_raw()

We need to find the root cause. Maybe our set is not on par with 
restore, i.e. we are calling more set than restore and thus the 
refcounter just keeps growing?

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] RFC: security: Make sure to decrease ref count label value
Posted by Stefan Berger 4 years, 8 months ago
On 7/29/19 5:17 AM, Michal Privoznik wrote:
> On 7/26/19 8:38 PM, Stefan Berger wrote:
>> I noticed that if a domain fails to restore, the ref count in the xattr
>> 'trusted.libvirt.security.ref_selinux' keeps on increasing indefinitely
>> and the VM will never restore even if the root cause for the restore
>> failure has been removed. The reason seems to be that the code to 
>> decrease
>> the ref count never gets called because the block above it fails due
>> to virSecuritySELinuxTransactionAppend() failing. The simple solution
>> seems to be to revert the order in which things are done.
>>
>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>> ---
>>   src/security/security_selinux.c | 19 +++++++++++--------
>>   1 file changed, 11 insertions(+), 8 deletions(-)
>
> So what was the root cause, did you ever find out?


The root cause of the failure was basically me (intentionally) changing 
the password for the encrypted vTPM to a wrong password and then QEMU 
cannot *resume* 'swtpm' since it cannot decrypt the state. The operation 
on virt-manager level was a restore of a saved VM.


>
>>
>> diff --git a/src/security/security_selinux.c 
>> b/src/security/security_selinux.c
>> index ea20373a90..9fd29e9bca 100644
>> --- a/src/security/security_selinux.c
>> +++ b/src/security/security_selinux.c
>> @@ -1499,14 +1499,9 @@ 
>> virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr,
>>           goto cleanup;
>>       }
>>   -    if ((rc = virSecuritySELinuxTransactionAppend(path, NULL,
>> -                                                  false, recall, 
>> true)) < 0) {
>> -        goto cleanup;
>> -    } else if (rc > 0) {
>> -        ret = 0;
>> -        goto cleanup;
>> -    }
>> -
>> +    /* Recall the label so the ref count label decreases its counter
>> +     * even if transaction append below fails.
>> +     */
>>       if (recall) {
>>           rc = virSecuritySELinuxRecallLabel(newpath, &fcon);
>>           if (rc == -2) {
>> @@ -1519,6 +1514,14 @@ 
>> virSecuritySELinuxRestoreFileLabel(virSecurityManagerPtr mgr,
>>           }
>>       }
>>   +    if ((rc = virSecuritySELinuxTransactionAppend(path, NULL,
>> +                                                  false, recall, 
>> true)) < 0) {
>> +        goto cleanup;
>> +    } else if (rc > 0) {
>> +        ret = 0;
>> +        goto cleanup;
>> +    }
>> +
>>       if (!recall || rc == -2) {
>>           if (stat(newpath, &buf) != 0) {
>>               VIR_WARN("cannot stat %s: %s", newpath,
>>
>
> IIUC, this gets then called twice, doesn't it? I mean, transactions 
> are hackish a bit. We wanted chown() and setfilecon_raw() to run 
> inside the mount namespace of a domain. So instead of rewriting our 
> secdrivers and making them enter the namespace on every 
> chown()/setfilecon_raw() we put small code snippets just before these 
> calls that would record arguments for these functions in a thread 
> local variable and then transactionCommit() would enter the namespace 
> and replay all chown()/setfilecon_raw().
>
> That's why there can't be any code before transactionAppend(), because 
> it would be called twice - once from regular call and from 
> transactionCommit(). For instance:
>
> virSecuritySELinuxDomainSetPathLabel()
>  -> virSecuritySELinuxSetFilecon()
>    -> virSecuritySELinuxSetFileconHelper()
>      -> virSecuritySELinuxTransactionAppend()
>
> virSecuritySELinuxTransactionCommit()
>   -> virSecuritySELinuxTransactionRun()
>     -> virSecuritySELinuxSetFileconHelper()
>       -> virSecuritySELinuxSetFileconImpl()
>         -> setfilecon_raw()
>
> We need to find the root cause. Maybe our set is not on par with 
> restore, i.e. we are calling more set than restore and thus the 
> refcounter just keeps growing?


The root cause was as described above and I wanted to see how QEMU+SWTPM 
react if I do this.


What I noticed was that the ...TransactionAppend() failed and didn't get 
to the code to decrease the ref count, which then ended up increasing 
forever. So I created this patch, which did indeed solve this particular 
problem but then caused Signal 11's once I provoked an error with 
starting (rather than resuming) a VM where again I had changed the 
password (iirc).


    Stefan


>
> Michal
>

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] RFC: security: Make sure to decrease ref count label value
Posted by Michal Privoznik 4 years, 8 months ago
On 8/1/19 7:45 PM, Stefan Berger wrote:
> On 7/29/19 5:17 AM, Michal Privoznik wrote:
>> On 7/26/19 8:38 PM, Stefan Berger wrote:
>>> I noticed that if a domain fails to restore, the ref count in the xattr
>>> 'trusted.libvirt.security.ref_selinux' keeps on increasing indefinitely
>>> and the VM will never restore even if the root cause for the restore
>>> failure has been removed. The reason seems to be that the code to 
>>> decrease
>>> the ref count never gets called because the block above it fails due
>>> to virSecuritySELinuxTransactionAppend() failing. The simple solution
>>> seems to be to revert the order in which things are done.
>>>
>>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>>> ---
>>>   src/security/security_selinux.c | 19 +++++++++++--------
>>>   1 file changed, 11 insertions(+), 8 deletions(-)
>>
>> So what was the root cause, did you ever find out?
> 
> 
> The root cause of the failure was basically me (intentionally) changing 
> the password for the encrypted vTPM to a wrong password and then QEMU 
> cannot *resume* 'swtpm' since it cannot decrypt the state. The operation 
> on virt-manager level was a restore of a saved VM.
 >
 > <snip/>
> 
> The root cause was as described above and I wanted to see how QEMU+SWTPM 
> react if I do this.
> 
> 
> What I noticed was that the ...TransactionAppend() failed and didn't get 
> to the code to decrease the ref count, which then ended up increasing 
> forever. So I created this patch, which did indeed solve this particular 
> problem but then caused Signal 11's once I provoked an error with 
> starting (rather than resuming) a VM where again I had changed the 
> password (iirc).
> 

Ah, so TransactionAppend() failed? Huh, looking into the code that can 
happen only in case of OOM at which case all the bets are off IMO. But 
then again, getting malloc() to return NULL is not easy and certainly 
won't happen in the default setup. So I'm wondering what else could make 
TransactionAppend() fail.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list