[PATCH] replay: fix recursive checkpoints

Pavel Dovgalyuk posted 1 patch 3 years ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/161700476500.1140362.10108444973730452257.stgit@pasha-ThinkPad-X280
Maintainers: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>, Paolo Bonzini <pbonzini@redhat.com>
replay/replay.c |   11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
[PATCH] replay: fix recursive checkpoints
Posted by Pavel Dovgalyuk 3 years ago
Record/replay uses checkpoints to synchronize the execution
of the threads and timers. Hardware events such as BHs are
processed at the checkpoints too.
Event processing can refresh the virtual timers and call
icount-related functions, which also use checkpoints.
This patch prevents recursive processing of such checkpoints,
because they have their own records in the log and should be
processed later.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
---
 replay/replay.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/replay/replay.c b/replay/replay.c
index c806fec69a..6df2abc18c 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -180,12 +180,13 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
     }
 
     if (in_checkpoint) {
-        /* If we are already in checkpoint, then there is no need
-           for additional synchronization.
+        /*
            Recursion occurs when HW event modifies timers.
-           Timer modification may invoke the checkpoint and
-           proceed to recursion. */
-        return true;
+           Prevent performing icount warp in this case and
+           wait for another invocation of the checkpoint.
+        */
+        g_assert(replay_mode == REPLAY_MODE_PLAY);
+        return false;
     }
     in_checkpoint = true;
 

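For readers following the thread, the behaviour change can be modelled
outside QEMU. The following is only a minimal sketch: checkpoint_sketch(),
run_event_sketch() and the counters are hypothetical names, not code from
replay/replay.c, and the sketch leaves out the replay mode (the patch
additionally asserts that a recursive call only happens in PLAY mode). It
shows a guard flag that makes the nested call return false, so the nested
checkpoint is left for a later top-level invocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names come from replay.c. */
bool in_checkpoint;       /* set while a checkpoint is being processed */
int nested_calls;         /* recursive calls that were refused */
int events_processed;     /* events handled by the outer checkpoint */

bool checkpoint_sketch(void);
void run_event_sketch(void);

/*
 * Models a hardware event (e.g. a BH) whose handler modifies a timer,
 * which in turn reaches the checkpoint code again.
 */
void run_event_sketch(void)
{
    bool nested = checkpoint_sketch();

    assert(!nested);      /* the nested call must not report success */
    events_processed++;
}

bool checkpoint_sketch(void)
{
    if (in_checkpoint) {
        /*
         * Recursive entry: do nothing now. The nested checkpoint has
         * its own record in the log and is handled by a later call.
         */
        nested_calls++;
        return false;
    }
    in_checkpoint = true;
    run_event_sketch();
    in_checkpoint = false;
    return true;
}
```

With the old "return true" the caller of the nested checkpoint would have
continued as if synchronization had happened; returning false tells it to
try again on a later invocation.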

Re: [PATCH] replay: fix recursive checkpoints
Posted by Paolo Bonzini 3 years ago
On 29/03/21 09:59, Pavel Dovgalyuk wrote:
> Record/replay uses checkpoints to synchronize the execution
> of the threads and timers. Hardware events such as BH are
> processed at the checkpoints too.
> Event processing can cause refreshing the virtual timers
> and calling the icount-related functions, that also use checkpoints.
> This patch prevents recursive processing of such checkpoints,
> because they have their own records in the log and should be
> processed later.
> 
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
> ---
>   replay/replay.c |   11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/replay/replay.c b/replay/replay.c
> index c806fec69a..6df2abc18c 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -180,12 +180,13 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
>       }
>   
>       if (in_checkpoint) {
> -        /* If we are already in checkpoint, then there is no need
> -           for additional synchronization.
> +        /*
>              Recursion occurs when HW event modifies timers.
> -           Timer modification may invoke the checkpoint and
> -           proceed to recursion. */
> -        return true;
> +           Prevent performing icount warp in this case and
> +           wait for another invocation of the checkpoint.
> +        */
> +        g_assert(replay_mode == REPLAY_MODE_PLAY);
> +        return false;
>       }
>       in_checkpoint = true;
>   
> 

Queued, thanks.

Paolo


Re: [PATCH] replay: fix recursive checkpoints
Posted by Alex Bennée 3 years ago
Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes:

> Record/replay uses checkpoints to synchronize the execution
> of the threads and timers. Hardware events such as BH are
> processed at the checkpoints too.
> Event processing can cause refreshing the virtual timers
> and calling the icount-related functions, that also use checkpoints.
> This patch prevents recursive processing of such checkpoints,
> because they have their own records in the log and should be
> processed later.
>
> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
> ---
>  replay/replay.c |   11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/replay/replay.c b/replay/replay.c
> index c806fec69a..6df2abc18c 100644
> --- a/replay/replay.c
> +++ b/replay/replay.c
> @@ -180,12 +180,13 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
>      }
>  
>      if (in_checkpoint) {
> -        /* If we are already in checkpoint, then there is no need
> -           for additional synchronization.
> +        /*
>             Recursion occurs when HW event modifies timers.
> -           Timer modification may invoke the checkpoint and
> -           proceed to recursion. */
> -        return true;
> +           Prevent performing icount warp in this case and
> +           wait for another invocation of the checkpoint.
> +        */

nit: as you are updating the comment you might as well fix the style. It
would probably help with the diff as well.

> +        g_assert(replay_mode == REPLAY_MODE_PLAY);
> +        return false;
>      }
>      in_checkpoint = true;

The accompanying comments in replay.h are also confusing:

    Returns 0 in PLAY mode if checkpoint was not found.
    Returns 1 in all other cases.

Which, translated to actual bool results, is:

    Returns false in PLAY mode if checkpoint was not found
    Returns true in all other cases

Which implies the checkpoint is always found (or created?). I'm not
even sure of that after following the rest of the replay_checkpoint
code, which has exit cases of:

    bool res = false; (default)
    replay_state.data_kind != EVENT_ASYNC;
    res = true; (when recording)

So is the following more correct?

/**
 * replay_checkpoint(checkpoint): save (in RECORD) or consume (in PLAY) checkpoint
 * @checkpoint: the checkpoint event
 *
 * In RECORD mode stores the checkpoint in the record and potentially
 * saves a number of events.
 *
 * In PLAY mode consumes checkpoint and any following EVENT_ASYNC events.
 *
 * Results: in RECORD mode always true
 *          in PLAY mode true unless checkpoint not found or recursively called.
 */

-- 
Alex Bennée

Re: [PATCH] replay: fix recursive checkpoints
Posted by Pavel Dovgalyuk 2 years, 12 months ago
On 29.03.2021 14:25, Alex Bennée wrote:
> 
> Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> writes:
> 
>> Record/replay uses checkpoints to synchronize the execution
>> of the threads and timers. Hardware events such as BH are
>> processed at the checkpoints too.
>> Event processing can cause refreshing the virtual timers
>> and calling the icount-related functions, that also use checkpoints.
>> This patch prevents recursive processing of such checkpoints,
>> because they have their own records in the log and should be
>> processed later.
>>
>> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
>> ---
>>   replay/replay.c |   11 ++++++-----
>>   1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/replay/replay.c b/replay/replay.c
>> index c806fec69a..6df2abc18c 100644
>> --- a/replay/replay.c
>> +++ b/replay/replay.c
>> @@ -180,12 +180,13 @@ bool replay_checkpoint(ReplayCheckpoint checkpoint)
>>       }
>>   
>>       if (in_checkpoint) {
>> -        /* If we are already in checkpoint, then there is no need
>> -           for additional synchronization.
>> +        /*
>>              Recursion occurs when HW event modifies timers.
>> -           Timer modification may invoke the checkpoint and
>> -           proceed to recursion. */
>> -        return true;
>> +           Prevent performing icount warp in this case and
>> +           wait for another invocation of the checkpoint.
>> +        */
> 
> nit: as you are updating the comment you might as well fix the style. It
> would probably help with the diff as well.
> 
>> +        g_assert(replay_mode == REPLAY_MODE_PLAY);
>> +        return false;
>>       }
>>       in_checkpoint = true;
> 
> The accompanying comments in replay.h are also confusing
> 
>      Returns 0 in PLAY mode if checkpoint was not found.
>      Returns 1 in all other cases.
> 
> Which translated to actual bool results:
> 
>      Returns false in PLAY mode if checkpoint was not found
>      Returns true in all other cases
> 
> Which implies the checkpoint is always found (or created?) which I'm not
> even sure of while following the rest of the replay_checkpoint code
> which has exit cases of:
> 
>      bool res = false; (default)
>      replay_state.data_kind != EVENT_ASYNC;
>      res = true; (when recording)
> 
> So is the following more correct?
> 
> /**
>   * replay_checkpoint(checkpoint): save (in RECORD) or consume (in PLAY) checkpoint
>   * @checkpoint: the checkpoint event
>   *
>   * In RECORD mode stores the checkpoint in the record and potentially
>   * saves a number of events.
>   *
>   * In PLAY mode consumes checkpoint and any following EVENT_ASYNC events.
>   *
>   * Results: in RECORD mode always true
>   *          in PLAY mode true unless checkpoint not found or recursively called.
>   */
> 

Almost true.
In PLAY mode it returns true only if the checkpoint was found and all
following async events were matched and processed.
Otherwise it returns false, and the unprocessed events are postponed to
be consumed later.

Pavel Dovgalyuk
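
The return-value rule described in this reply can be illustrated with a
small model. This is a sketch under stated assumptions, not QEMU code:
the sketch_* types, the log layout and the "pending" array of ready
async events are all hypothetical and exist only for illustration. In
RECORD mode the function always returns true; in PLAY mode it returns
true only when the checkpoint is found and every async event that
follows it in the log could be matched, and otherwise leaves the
remaining events in the log for a later call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are not QEMU's. */
enum sketch_mode { SKETCH_RECORD, SKETCH_PLAY };
enum sketch_kind { SKETCH_CHECKPOINT, SKETCH_ASYNC, SKETCH_OTHER };

struct sketch_entry {
    enum sketch_kind kind;
    int id;                  /* checkpoint id or async event id */
};

struct sketch_log {
    const struct sketch_entry *entries;
    size_t len;
    size_t pos;              /* next entry to consume */
};

/* pending[] holds the ids of async events that are ready, in order. */
bool sketch_checkpoint(enum sketch_mode mode, struct sketch_log *log,
                       int checkpoint_id,
                       const int *pending, size_t npending)
{
    const struct sketch_entry *cur;
    size_t used = 0;

    if (mode == SKETCH_RECORD) {
        return true;         /* recording always succeeds */
    }

    cur = log->pos < log->len ? &log->entries[log->pos] : NULL;
    if (cur && cur->kind == SKETCH_CHECKPOINT && cur->id == checkpoint_id) {
        log->pos++;          /* checkpoint found: consume it */
    } else if (!cur || cur->kind != SKETCH_ASYNC) {
        return false;        /* neither the checkpoint nor postponed events */
    }

    /* Consume the async events that follow, as long as they are ready. */
    while (log->pos < log->len &&
           log->entries[log->pos].kind == SKETCH_ASYNC) {
        if (used == npending ||
            pending[used] != log->entries[log->pos].id) {
            return false;    /* postponed: stays in the log for later */
        }
        used++;
        log->pos++;
    }
    return true;
}
```

In this model a caller that gets false simply calls again once more
events are ready; the second call resumes at the first unprocessed
async event instead of looking for the checkpoint again.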