Some jobs upon finalization may need to perform some work that can
still fail. If these jobs are part of a transaction, it's important
that these callbacks fail the entire transaction.
We allow for a new callback in addition to commit/abort/clean that
allows us the opportunity to have fairly late-breaking failures
in the transactional process.
The expected flow is:
- All jobs in a transaction converge to the WAITING state
(added in a forthcoming commit)
- All jobs prepare to call either commit/abort
- If any job fails, is canceled, or fails preparation, all jobs
call their .abort callback.
- All jobs enter the PENDING state, awaiting manual intervention
(also added in a forthcoming commit)
- block-job-finalize is issued by the user/management layer
- All jobs call their commit callbacks.
Signed-off-by: John Snow <jsnow@redhat.com>
---
blockjob.c | 34 +++++++++++++++++++++++++++++++---
include/block/blockjob_int.h | 10 ++++++++++
2 files changed, 41 insertions(+), 3 deletions(-)
diff --git a/blockjob.c b/blockjob.c
index 8f02c03880..1c010ec100 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
}
}
+static int block_job_prepare(BlockJob *job)
+{
+ if (job->ret) {
+ goto out;
+ }
+ if (job->driver->prepare) {
+ job->ret = job->driver->prepare(job);
+ }
+ out:
+ return job->ret;
+}
+
static void block_job_commit(BlockJob *job)
{
assert(!job->ret);
@@ -417,7 +429,7 @@ static void block_job_clean(BlockJob *job)
}
}
-static void block_job_completed_single(BlockJob *job)
+static int block_job_completed_single(BlockJob *job)
{
assert(job->completed);
@@ -452,6 +464,7 @@ static void block_job_completed_single(BlockJob *job)
block_job_txn_unref(job->txn);
block_job_event_concluded(job);
block_job_unref(job);
+ return 0;
}
static void block_job_cancel_async(BlockJob *job)
@@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
job->cancelled = true;
}
-static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
+static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
{
AioContext *ctx;
BlockJob *job, *next;
+ int rc;
QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
ctx = blk_get_aio_context(job->blk);
aio_context_acquire(ctx);
- fn(job);
+ rc = fn(job);
aio_context_release(ctx);
+ if (rc) {
+ break;
+ }
}
+ return rc;
}
static void block_job_do_dismiss(BlockJob *job)
@@ -567,6 +585,8 @@ static void block_job_completed_txn_success(BlockJob *job)
{
BlockJobTxn *txn = job->txn;
BlockJob *other_job;
+ int rc = 0;
+
/*
* Successful completion, see if there are other running jobs in this
* txn.
@@ -576,6 +596,14 @@ static void block_job_completed_txn_success(BlockJob *job)
return;
}
}
+
+ /* Jobs may require some prep-work to complete without failure */
+ rc = block_job_txn_apply(txn, block_job_prepare);
+ if (rc) {
+ block_job_completed_txn_abort(job);
+ return;
+ }
+
/* We are the last completed job, commit the transaction. */
block_job_txn_apply(txn, block_job_completed_single);
}
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 259d49b32a..642adce68b 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -53,6 +53,16 @@ struct BlockJobDriver {
*/
void (*complete)(BlockJob *job, Error **errp);
+ /**
+ * If the callback is not NULL, prepare will be invoked when all the jobs
+ * belonging to the same transaction complete; or upon this job's completion
+ * if it is not in a transaction.
+ *
+ * This callback will not be invoked if the job has already failed.
+ * If it fails, abort and then clean will be called.
+ */
+ int (*prepare)(BlockJob *job);
+
/**
* If the callback is not NULL, it will be invoked when all the jobs
* belonging to the same transaction complete; or upon this job's
--
2.14.3
On 02/23/2018 05:51 PM, John Snow wrote:
> Some jobs upon finalization may need to perform some work that can
> still fail. If these jobs are part of a transaction, it's important
> that these callbacks fail the entire transaction.
>
> We allow for a new callback in addition to commit/abort/clean that
> allows us the opportunity to have fairly late-breaking failures
> in the transactional process.
>
> The expected flow is:
>
> - All jobs in a transaction converge to the WAITING state
> (added in a forthcoming commit)
> - All jobs prepare to call either commit/abort
> - If any job fails, is canceled, or fails preparation, all jobs
> call their .abort callback.
> - All jobs enter the PENDING state, awaiting manual intervention
> (also added in a forthcoming commit)
> - block-job-finalize is issued by the user/management layer
> - All jobs call their commit callbacks.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
> blockjob.c | 34 +++++++++++++++++++++++++++++++---
> include/block/blockjob_int.h | 10 ++++++++++
> 2 files changed, 41 insertions(+), 3 deletions(-)
>
> @@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
> job->cancelled = true;
> }
>
> -static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
> +static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
> {
> AioContext *ctx;
> BlockJob *job, *next;
> + int rc;
>
> QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
> ctx = blk_get_aio_context(job->blk);
> aio_context_acquire(ctx);
> - fn(job);
> + rc = fn(job);
> aio_context_release(ctx);
> + if (rc) {
> + break;
> + }
This short-circuits the application of the function to the rest of the
group. Is that ever going to be a problem?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
On 02/27/2018 02:56 PM, Eric Blake wrote:
> On 02/23/2018 05:51 PM, John Snow wrote:
>> Some jobs upon finalization may need to perform some work that can
>> still fail. If these jobs are part of a transaction, it's important
>> that these callbacks fail the entire transaction.
>>
>> We allow for a new callback in addition to commit/abort/clean that
>> allows us the opportunity to have fairly late-breaking failures
>> in the transactional process.
>>
>> The expected flow is:
>>
>> - All jobs in a transaction converge to the WAITING state
>> (added in a forthcoming commit)
>> - All jobs prepare to call either commit/abort
>> - If any job fails, is canceled, or fails preparation, all jobs
>> call their .abort callback.
>> - All jobs enter the PENDING state, awaiting manual intervention
>> (also added in a forthcoming commit)
>> - block-job-finalize is issued by the user/management layer
>> - All jobs call their commit callbacks.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>> blockjob.c | 34 +++++++++++++++++++++++++++++++---
>> include/block/blockjob_int.h | 10 ++++++++++
>> 2 files changed, 41 insertions(+), 3 deletions(-)
>>
>
>> @@ -467,17 +480,22 @@ static void block_job_cancel_async(BlockJob *job)
>> job->cancelled = true;
>> }
>> -static void block_job_txn_apply(BlockJobTxn *txn, void fn(BlockJob *))
>> +static int block_job_txn_apply(BlockJobTxn *txn, int fn(BlockJob *))
>> {
>> AioContext *ctx;
>> BlockJob *job, *next;
>> + int rc;
>> QLIST_FOREACH_SAFE(job, &txn->jobs, txn_list, next) {
>> ctx = blk_get_aio_context(job->blk);
>> aio_context_acquire(ctx);
>> - fn(job);
>> + rc = fn(job);
>> aio_context_release(ctx);
>> + if (rc) {
>> + break;
>> + }
>
> This short-circuits the application of the function to the rest of the
> group. Is that ever going to be a problem?
>
With what I've written, I don't think so -- but I can't guarantee
someone won't misunderstand the semantics of it and it will become a
problem. It is a potentially dangerous function in that way.
--js
Am 24.02.2018 um 00:51 hat John Snow geschrieben:
> Some jobs upon finalization may need to perform some work that can
> still fail. If these jobs are part of a transaction, it's important
> that these callbacks fail the entire transaction.
>
> We allow for a new callback in addition to commit/abort/clean that
> allows us the opportunity to have fairly late-breaking failures
> in the transactional process.
>
> The expected flow is:
>
> - All jobs in a transaction converge to the WAITING state
> (added in a forthcoming commit)
> - All jobs prepare to call either commit/abort
> - If any job fails, is canceled, or fails preparation, all jobs
> call their .abort callback.
> - All jobs enter the PENDING state, awaiting manual intervention
> (also added in a forthcoming commit)
> - block-job-finalize is issued by the user/management layer
> - All jobs call their commit callbacks.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
You almost made me believe the scary thought that we need transactional
graph modifications, but after writing half of the reply, I think it's
just that your order here is wrong.
So .prepare is the last thing in the whole process that is allowed to
fail. Graph manipulations such as bdrv_replace_node() can fail. Graph
manipulations can also only be made in response to block-job-finalize
because the management layer must be aware of them. Take them together
and you have a problem.
Didn't we already establish earlier that .prepare/.commit/.abort must be
called together and cannot be separated by waiting for a QMP command
because of locking and things?
So if you go to PENDING first, then wait for block-job-finalize and only
then call .prepare/.commit/.abort, we should be okay for both problems.
And taking a look at the final state, that seems to be what you do, so
in the end, it's probably just the commit message that needs a fix.
> blockjob.c | 34 +++++++++++++++++++++++++++++++---
> include/block/blockjob_int.h | 10 ++++++++++
> 2 files changed, 41 insertions(+), 3 deletions(-)
>
> diff --git a/blockjob.c b/blockjob.c
> index 8f02c03880..1c010ec100 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
> }
> }
>
> +static int block_job_prepare(BlockJob *job)
> +{
> + if (job->ret) {
> + goto out;
> + }
> + if (job->driver->prepare) {
> + job->ret = job->driver->prepare(job);
> + }
> + out:
> + return job->ret;
> +}
Why not just if (job->ret == 0 && job->driver->prepare) and save the
goto?
Kevin
On 02/28/2018 12:04 PM, Kevin Wolf wrote:
> Am 24.02.2018 um 00:51 hat John Snow geschrieben:
>> Some jobs upon finalization may need to perform some work that can
>> still fail. If these jobs are part of a transaction, it's important
>> that these callbacks fail the entire transaction.
>>
>> We allow for a new callback in addition to commit/abort/clean that
>> allows us the opportunity to have fairly late-breaking failures
>> in the transactional process.
>>
>> The expected flow is:
>>
>> - All jobs in a transaction converge to the WAITING state
>> (added in a forthcoming commit)
>> - All jobs prepare to call either commit/abort
>> - If any job fails, is canceled, or fails preparation, all jobs
>> call their .abort callback.
>> - All jobs enter the PENDING state, awaiting manual intervention
>> (also added in a forthcoming commit)
>> - block-job-finalize is issued by the user/management layer
>> - All jobs call their commit callbacks.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>
> You almost made me believe the scary thought that we need transactional
> graph modifications, but after writing half of the reply, I think it's
> just that your order here is wrong.
>
Sorry, yes, this blurb was outdated. I regret that it wasted your time.
> So .prepare is the last thing in the whole process that is allowed to
> fail. Graph manipulations such as bdrv_replace_node() can fail. Graph
> manipulations can also only be made in response to block-job-finalize
> because the management layer must be aware of them. Take them together
> and you have a problem.
>
> Didn't we already establish earlier that .prepare/.commit/.abort must be
> called together and cannot be separated by waiting for a QMP command
> because of locking and things?
>
Right; so what really happens is that in response to the FINALIZE verb,
the prepare loop is done first to check for success, and then commit or
abort are dispatched as appropriate.
> So if you go to PENDING first, then wait for block-job-finalize and only
> then call .prepare/.commit/.abort, we should be okay for both problems.
>
> And taking a look at the final state, that seems to be what you do, so
> in the end, it's probably just the commit message that needs a fix.
Yep, sorry.
>
>> blockjob.c | 34 +++++++++++++++++++++++++++++++---
>> include/block/blockjob_int.h | 10 ++++++++++
>> 2 files changed, 41 insertions(+), 3 deletions(-)
>>
>> diff --git a/blockjob.c b/blockjob.c
>> index 8f02c03880..1c010ec100 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -394,6 +394,18 @@ static void block_job_update_rc(BlockJob *job)
>> }
>> }
>>
>> +static int block_job_prepare(BlockJob *job)
>> +{
>> + if (job->ret) {
>> + goto out;
>> + }
>> + if (job->driver->prepare) {
>> + job->ret = job->driver->prepare(job);
>> + }
>> + out:
>> + return job->ret;
>> +}
>
> Why not just if (job->ret == 0 && job->driver->prepare) and save the
> goto?
>
Churn. ¯\_(ツ)_/¯
> Kevin
>
© 2016 - 2026 Red Hat, Inc.