From: Paolo Bonzini
To: qemu-devel@nongnu.org
Date: Thu, 29 Jun 2017 15:27:47 +0200
Message-Id: <20170629132749.997-10-pbonzini@redhat.com>
In-Reply-To: <20170629132749.997-1-pbonzini@redhat.com>
References: <20170629132749.997-1-pbonzini@redhat.com>
Subject: [Qemu-devel] [PATCH 09/11] qed: protect table cache with CoMutex
Cc: kwolf@redhat.com, qemu-block@nongnu.org, stefanha@redhat.com

This makes the driver thread-safe.  The CoMutex is dropped temporarily
while accessing the data clusters or the backing file.

Signed-off-by: Paolo Bonzini
---
RFC->v2: add bdrv_qed_invalidate_cache change; invalidate_cache can run
in a coroutine when called by block migration.

 block/qed-cluster.c  |   4 +-
 block/qed-l2-cache.c |   6 +++
 block/qed-table.c    |  24 +++++++---
 block/qed.c          | 133 +++++++++++++++++++++++++++++++++++----------
 block/qed.h          |  11 +++--
 5 files changed, 124 insertions(+), 54 deletions(-)

diff --git a/block/qed-cluster.c b/block/qed-cluster.c
index d8d6e66a0f..672e2e654b 100644
--- a/block/qed-cluster.c
+++ b/block/qed-cluster.c
@@ -85,6 +85,8 @@ static unsigned int qed_count_contiguous_clusters(BDRVQEDState *s,
  *
  * On failure QED_CLUSTER_L2 or QED_CLUSTER_L1 is returned for missing L2 or L1
  * table offset, respectively. len is number of contiguous unallocated bytes.
+ *
+ * Called with table_lock held.
  */
 int coroutine_fn qed_find_cluster(BDRVQEDState *s, QEDRequest *request,
                                   uint64_t pos, size_t *len,
@@ -112,7 +114,6 @@ int coroutine_fn qed_find_cluster(BDRVQEDState *s, QEDRequest *request,
     }
 
     ret = qed_read_l2_table(s, request, l2_offset);
-    qed_acquire(s);
     if (ret) {
         goto out;
     }
@@ -137,6 +138,5 @@ int coroutine_fn qed_find_cluster(BDRVQEDState *s, QEDRequest *request,
 
 out:
     *img_offset = offset;
-    qed_release(s);
     return ret;
 }
diff --git a/block/qed-l2-cache.c b/block/qed-l2-cache.c
index 5cba794650..b548362398 100644
--- a/block/qed-l2-cache.c
+++ b/block/qed-l2-cache.c
@@ -101,6 +101,8 @@ CachedL2Table *qed_alloc_l2_cache_entry(L2TableCache *l2_cache)
 /**
  * Decrease an entry's reference count and free if necessary when the reference
  * count drops to zero.
+ *
+ * Called with table_lock held.
  */
 void qed_unref_l2_cache_entry(CachedL2Table *entry)
 {
@@ -122,6 +124,8 @@ void qed_unref_l2_cache_entry(CachedL2Table *entry)
  *
  * For a cached entry, this function increases the reference count and returns
  * the entry.
+ *
+ * Called with table_lock held.
  */
 CachedL2Table *qed_find_l2_cache_entry(L2TableCache *l2_cache, uint64_t offset)
 {
@@ -150,6 +154,8 @@ CachedL2Table *qed_find_l2_cache_entry(L2TableCache *l2_cache, uint64_t offset)
  * N.B. This function steals a reference to the l2_table from the caller so the
  * caller must obtain a new reference by issuing a call to
  * qed_find_l2_cache_entry().
+ *
+ * Called with table_lock held.
  */
 void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table)
 {
diff --git a/block/qed-table.c b/block/qed-table.c
index ebee2c50f0..eead8b0fc7 100644
--- a/block/qed-table.c
+++ b/block/qed-table.c
@@ -18,6 +18,7 @@
 #include "qed.h"
 #include "qemu/bswap.h"
 
+/* Called either from qed_check or with table_lock held. */
 static int qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table)
 {
     QEMUIOVector qiov;
@@ -32,18 +33,22 @@ static int qed_read_table(BDRVQEDState *s, uint64_t offset, QEDTable *table)
 
     trace_qed_read_table(s, offset, table);
 
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_unlock(&s->table_lock);
+    }
     ret = bdrv_preadv(s->bs->file, offset, &qiov);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_lock(&s->table_lock);
+    }
     if (ret < 0) {
         goto out;
     }
 
     /* Byteswap offsets */
-    qed_acquire(s);
     noffsets = qiov.size / sizeof(uint64_t);
     for (i = 0; i < noffsets; i++) {
         table->offsets[i] = le64_to_cpu(table->offsets[i]);
     }
-    qed_release(s);
 
     ret = 0;
 out:
@@ -61,6 +66,8 @@ out:
  * @index: Index of first element
  * @n: Number of elements
  * @flush: Whether or not to sync to disk
+ *
+ * Called either from qed_check or with table_lock held.
  */
 static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
                            unsigned int index, unsigned int n, bool flush)
@@ -97,16 +104,20 @@ static int qed_write_table(BDRVQEDState *s, uint64_t offset, QEDTable *table,
     /* Adjust for offset into table */
     offset += start * sizeof(uint64_t);
 
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_unlock(&s->table_lock);
+    }
     ret = bdrv_pwritev(s->bs->file, offset, &qiov);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_lock(&s->table_lock);
+    }
     trace_qed_write_table_cb(s, table, flush, ret);
     if (ret < 0) {
         goto out;
     }
 
     if (flush) {
-        qed_acquire(s);
         ret = bdrv_flush(s->bs);
-        qed_release(s);
         if (ret < 0) {
             goto out;
         }
@@ -123,6 +134,7 @@ int qed_read_l1_table_sync(BDRVQEDState *s)
     return qed_read_table(s, s->header.l1_table_offset, s->l1_table);
 }
 
+/* Called either from qed_check or with table_lock held. */
 int qed_write_l1_table(BDRVQEDState *s, unsigned int index, unsigned int n)
 {
     BLKDBG_EVENT(s->bs->file, BLKDBG_L1_UPDATE);
@@ -136,6 +148,7 @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
     return qed_write_l1_table(s, index, n);
 }
 
+/* Called either from qed_check or with table_lock held. */
 int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
 {
     int ret;
@@ -154,7 +167,6 @@ int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
     BLKDBG_EVENT(s->bs->file, BLKDBG_L2_LOAD);
     ret = qed_read_table(s, offset, request->l2_table->table);
 
-    qed_acquire(s);
     if (ret) {
         /* can't trust loaded L2 table anymore */
         qed_unref_l2_cache_entry(request->l2_table);
@@ -170,7 +182,6 @@ int qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
         request->l2_table = qed_find_l2_cache_entry(&s->l2_cache, offset);
         assert(request->l2_table != NULL);
     }
-    qed_release(s);
 
     return ret;
 }
@@ -180,6 +191,7 @@ int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset
     return qed_read_l2_table(s, request, offset);
 }
 
+/* Called either from qed_check or with table_lock held. */
 int qed_write_l2_table(BDRVQEDState *s, QEDRequest *request,
                        unsigned int index, unsigned int n, bool flush)
 {
diff --git a/block/qed.c b/block/qed.c
index 8228a50f68..afad14fc01 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -93,6 +93,8 @@ int qed_write_header_sync(BDRVQEDState *s)
  *
  * This function only updates known header fields in-place and does not affect
  * extra data after the QED header.
+ *
+ * No new allocating reqs can start while this function runs.
  */
 static int coroutine_fn qed_write_header(BDRVQEDState *s)
 {
@@ -109,6 +111,8 @@ static int coroutine_fn qed_write_header(BDRVQEDState *s)
     QEMUIOVector qiov;
     int ret;
 
+    assert(s->allocating_acb || s->allocating_write_reqs_plugged);
+
     buf = qemu_blockalign(s->bs, len);
     iov = (struct iovec) {
         .iov_base = buf,
@@ -219,6 +223,8 @@ static int qed_read_string(BdrvChild *file, uint64_t offset, size_t n,
  * This function only produces the offset where the new clusters should be
  * written.  It updates BDRVQEDState but does not make any changes to the image
  * file.
+ *
+ * Called with table_lock held.
  */
 static uint64_t qed_alloc_clusters(BDRVQEDState *s, unsigned int n)
 {
@@ -236,6 +242,8 @@ QEDTable *qed_alloc_table(BDRVQEDState *s)
 
 /**
  * Allocate a new zeroed L2 table
+ *
+ * Called with table_lock held.
  */
 static CachedL2Table *qed_new_l2_table(BDRVQEDState *s)
 {
@@ -249,19 +257,32 @@ static CachedL2Table *qed_new_l2_table(BDRVQEDState *s)
     return l2_table;
 }
 
-static void qed_plug_allocating_write_reqs(BDRVQEDState *s)
+static bool qed_plug_allocating_write_reqs(BDRVQEDState *s)
 {
+    qemu_co_mutex_lock(&s->table_lock);
+
+    /* No reentrancy is allowed. */
     assert(!s->allocating_write_reqs_plugged);
+    if (s->allocating_acb != NULL) {
+        /* Another allocating write came concurrently. This cannot happen
+         * from bdrv_qed_co_drain, but it can happen when the timer runs.
+         */
+        qemu_co_mutex_unlock(&s->table_lock);
+        return false;
+    }
 
     s->allocating_write_reqs_plugged = true;
+    qemu_co_mutex_unlock(&s->table_lock);
+    return true;
 }
 
 static void qed_unplug_allocating_write_reqs(BDRVQEDState *s)
 {
+    qemu_co_mutex_lock(&s->table_lock);
     assert(s->allocating_write_reqs_plugged);
-    s->allocating_write_reqs_plugged = false;
-    qemu_co_enter_next(&s->allocating_write_reqs);
+    qemu_co_queue_next(&s->allocating_write_reqs);
+    qemu_co_mutex_unlock(&s->table_lock);
 }
 
 static void coroutine_fn qed_need_check_timer_entry(void *opaque)
@@ -269,17 +290,14 @@ static void coroutine_fn qed_need_check_timer_entry(void *opaque)
     BDRVQEDState *s = opaque;
     int ret;
 
-    /* The timer should only fire when allocating writes have drained */
-    assert(!s->allocating_acb);
-
     trace_qed_need_check_timer_cb(s);
 
-    qed_acquire(s);
-    qed_plug_allocating_write_reqs(s);
+    if (!qed_plug_allocating_write_reqs(s)) {
+        return;
+    }
 
     /* Ensure writes are on disk before clearing flag */
     ret = bdrv_co_flush(s->bs->file->bs);
-    qed_release(s);
     if (ret < 0) {
         qed_unplug_allocating_write_reqs(s);
         return;
     }
@@ -301,16 +319,6 @@ static void qed_need_check_timer_cb(void *opaque)
     qemu_coroutine_enter(co);
 }
 
-void qed_acquire(BDRVQEDState *s)
-{
-    aio_context_acquire(bdrv_get_aio_context(s->bs));
-}
-
-void qed_release(BDRVQEDState *s)
-{
-    aio_context_release(bdrv_get_aio_context(s->bs));
-}
-
 static void qed_start_need_check_timer(BDRVQEDState *s)
 {
     trace_qed_start_need_check_timer(s);
@@ -369,6 +377,7 @@ static void bdrv_qed_init_state(BlockDriverState *bs)
 
     memset(s, 0, sizeof(BDRVQEDState));
     s->bs = bs;
+    qemu_co_mutex_init(&s->table_lock);
     qemu_co_queue_init(&s->allocating_write_reqs);
 }
 
@@ -688,6 +697,7 @@ typedef struct {
     BlockDriverState **file;
 } QEDIsAllocatedCB;
 
+/* Called with table_lock held. */
 static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
 {
     QEDIsAllocatedCB *cb = opaque;
@@ -735,6 +745,7 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
     uint64_t offset;
     int ret;
 
+    qemu_co_mutex_lock(&s->table_lock);
     ret = qed_find_cluster(s, &request, cb.pos, &len, &offset);
     qed_is_allocated_cb(&cb, ret, offset, len);
 
@@ -742,6 +753,7 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
     assert(cb.status != BDRV_BLOCK_OFFSET_MASK);
 
     qed_unref_l2_cache_entry(request.l2_table);
+    qemu_co_mutex_unlock(&s->table_lock);
 
     return cb.status;
 }
@@ -872,6 +884,8 @@ out:
  *
  * The cluster offset may be an allocated byte offset in the image file, the
  * zero cluster marker, or the unallocated cluster marker.
+ *
+ * Called with table_lock held.
  */
 static void coroutine_fn qed_update_l2_table(BDRVQEDState *s, QEDTable *table,
                                             int index, unsigned int n,
@@ -887,6 +901,7 @@ static void coroutine_fn qed_update_l2_table(BDRVQEDState *s, QEDTable *table,
     }
 }
 
+/* Called with table_lock held. */
 static void coroutine_fn qed_aio_complete(QEDAIOCB *acb)
 {
     BDRVQEDState *s = acb_to_s(acb);
@@ -910,7 +925,7 @@ static void coroutine_fn qed_aio_complete(QEDAIOCB *acb)
     if (acb == s->allocating_acb) {
         s->allocating_acb = NULL;
         if (!qemu_co_queue_empty(&s->allocating_write_reqs)) {
-            qemu_co_enter_next(&s->allocating_write_reqs);
+            qemu_co_queue_next(&s->allocating_write_reqs);
         } else if (s->header.features & QED_F_NEED_CHECK) {
             qed_start_need_check_timer(s);
         }
@@ -919,6 +934,8 @@ static void coroutine_fn qed_aio_complete(QEDAIOCB *acb)
 
 /**
  * Update L1 table with new L2 table offset and write it out
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_write_l1_update(QEDAIOCB *acb)
 {
@@ -947,6 +964,8 @@ static int coroutine_fn qed_aio_write_l1_update(QEDAIOCB *acb)
 
 /**
  * Update L2 table with new cluster offsets and write them out
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
 {
@@ -983,6 +1002,8 @@ static int coroutine_fn qed_aio_write_l2_update(QEDAIOCB *acb, uint64_t offset)
 
 /**
  * Write data to the image file
+ *
+ * Called with table_lock *not* held.
  */
 static int coroutine_fn qed_aio_write_main(QEDAIOCB *acb)
 {
@@ -999,6 +1020,8 @@ static int coroutine_fn qed_aio_write_main(QEDAIOCB *acb)
 
 /**
  * Populate untouched regions of new data cluster
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_write_cow(QEDAIOCB *acb)
 {
@@ -1006,6 +1029,8 @@ static int coroutine_fn qed_aio_write_cow(QEDAIOCB *acb)
     uint64_t start, len, offset;
     int ret;
 
+    qemu_co_mutex_unlock(&s->table_lock);
+
     /* Populate front untouched region of new data cluster */
     start = qed_start_of_cluster(s, acb->cur_pos);
     len = qed_offset_into_cluster(s, acb->cur_pos);
@@ -1013,7 +1038,7 @@ static int coroutine_fn qed_aio_write_cow(QEDAIOCB *acb)
     trace_qed_aio_write_prefill(s, acb, start, len, acb->cur_cluster);
     ret = qed_copy_from_backing_file(s, start, len, acb->cur_cluster);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
 
     /* Populate back untouched region of new data cluster */
@@ -1026,12 +1051,12 @@ static int coroutine_fn qed_aio_write_cow(QEDAIOCB *acb)
     trace_qed_aio_write_postfill(s, acb, start, len, offset);
     ret = qed_copy_from_backing_file(s, start, len, offset);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
 
     ret = qed_aio_write_main(acb);
     if (ret < 0) {
-        return ret;
+        goto out;
     }
 
     if (s->bs->backing) {
@@ -1046,12 +1071,11 @@ static int coroutine_fn qed_aio_write_cow(QEDAIOCB *acb)
          * cluster and before updating the L2 table.
          */
         ret = bdrv_co_flush(s->bs->file->bs);
-        if (ret < 0) {
-            return ret;
-        }
     }
 
-    return 0;
+out:
+    qemu_co_mutex_lock(&s->table_lock);
+    return ret;
 }
 
 /**
@@ -1074,6 +1098,8 @@ static bool qed_should_set_need_check(BDRVQEDState *s)
  * @len: Length in bytes
  *
  * This path is taken when writing to previously unallocated clusters.
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
 {
@@ -1088,7 +1114,7 @@ static int coroutine_fn qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
     /* Freeze this request if another allocating write is in progress */
     if (s->allocating_acb != acb || s->allocating_write_reqs_plugged) {
         if (s->allocating_acb != NULL) {
-            qemu_co_queue_wait(&s->allocating_write_reqs, NULL);
+            qemu_co_queue_wait(&s->allocating_write_reqs, &s->table_lock);
             assert(s->allocating_acb == NULL);
         }
         s->allocating_acb = acb;
@@ -1135,10 +1161,17 @@ static int coroutine_fn qed_aio_write_alloc(QEDAIOCB *acb, size_t len)
  * @len: Length in bytes
  *
  * This path is taken when writing to already allocated clusters.
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset,
                                               size_t len)
 {
+    BDRVQEDState *s = acb_to_s(acb);
+    int r;
+
+    qemu_co_mutex_unlock(&s->table_lock);
+
     /* Allocate buffer for zero writes */
     if (acb->flags & QED_AIOCB_ZERO) {
         struct iovec *iov = acb->qiov->iov;
@@ -1146,7 +1179,8 @@ static int coroutine_fn qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset,
         if (!iov->iov_base) {
             iov->iov_base = qemu_try_blockalign(acb->bs, iov->iov_len);
             if (iov->iov_base == NULL) {
-                return -ENOMEM;
+                r = -ENOMEM;
+                goto out;
             }
             memset(iov->iov_base, 0, iov->iov_len);
         }
@@ -1156,8 +1190,11 @@ static int coroutine_fn qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset,
     acb->cur_cluster = offset;
     qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
-    /* Do the actual write */
-    return qed_aio_write_main(acb);
+    /* Do the actual write. */
+    r = qed_aio_write_main(acb);
+out:
+    qemu_co_mutex_lock(&s->table_lock);
+    return r;
 }
 
 /**
@@ -1167,6 +1204,8 @@ static int coroutine_fn qed_aio_write_inplace(QEDAIOCB *acb, uint64_t offset,
  * @ret: QED_CLUSTER_FOUND, QED_CLUSTER_L2 or QED_CLUSTER_L1
  * @offset: Cluster offset in bytes
  * @len: Length in bytes
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_write_data(void *opaque, int ret,
                                            uint64_t offset, size_t len)
@@ -1198,6 +1237,8 @@ static int coroutine_fn qed_aio_write_data(void *opaque, int ret,
  * @ret: QED_CLUSTER_FOUND, QED_CLUSTER_L2 or QED_CLUSTER_L1
  * @offset: Cluster offset in bytes
  * @len: Length in bytes
+ *
+ * Called with table_lock held.
  */
 static int coroutine_fn qed_aio_read_data(void *opaque, int ret,
                                           uint64_t offset, size_t len)
@@ -1205,6 +1246,9 @@ static int coroutine_fn qed_aio_read_data(void *opaque, int ret,
     QEDAIOCB *acb = opaque;
     BDRVQEDState *s = acb_to_s(acb);
     BlockDriverState *bs = acb->bs;
+    int r;
+
+    qemu_co_mutex_unlock(&s->table_lock);
 
     /* Adjust offset into cluster */
     offset += qed_offset_into_cluster(s, acb->cur_pos);
@@ -1213,22 +1257,23 @@ static int coroutine_fn qed_aio_read_data(void *opaque, int ret,
 
     qemu_iovec_concat(&acb->cur_qiov, acb->qiov, acb->qiov_offset, len);
 
-    /* Handle zero cluster and backing file reads */
+    /* Handle zero cluster and backing file reads, otherwise read
+     * data cluster directly.
+     */
     if (ret == QED_CLUSTER_ZERO) {
         qemu_iovec_memset(&acb->cur_qiov, 0, 0, acb->cur_qiov.size);
-        return 0;
+        r = 0;
     } else if (ret != QED_CLUSTER_FOUND) {
-        return qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
-                                     &acb->backing_qiov);
+        r = qed_read_backing_file(s, acb->cur_pos, &acb->cur_qiov,
+                                  &acb->backing_qiov);
+    } else {
+        BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
+        r = bdrv_co_preadv(bs->file, offset, acb->cur_qiov.size,
+                           &acb->cur_qiov, 0);
     }
 
-    BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
-    ret = bdrv_co_preadv(bs->file, offset, acb->cur_qiov.size,
-                         &acb->cur_qiov, 0);
-    if (ret < 0) {
-        return ret;
-    }
-    return 0;
+    qemu_co_mutex_lock(&s->table_lock);
+    return r;
 }
 
 /**
@@ -1241,6 +1286,7 @@ static int coroutine_fn qed_aio_next_io(QEDAIOCB *acb)
     size_t len;
     int ret;
 
+    qemu_co_mutex_lock(&s->table_lock);
     while (1) {
         trace_qed_aio_next_io(s, acb, 0, acb->cur_pos + acb->cur_qiov.size);
 
@@ -1280,6 +1326,7 @@ static int coroutine_fn qed_aio_next_io(QEDAIOCB *acb)
 
     trace_qed_aio_complete(s, acb, ret);
     qed_aio_complete(acb);
+    qemu_co_mutex_unlock(&s->table_lock);
     return ret;
 }
 
@@ -1469,7 +1516,13 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
     bdrv_qed_close(bs);
 
     bdrv_qed_init_state(bs);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_lock(&s->table_lock);
+    }
     ret = bdrv_qed_do_open(bs, NULL, bs->open_flags, &local_err);
+    if (qemu_in_coroutine()) {
+        qemu_co_mutex_unlock(&s->table_lock);
+    }
     if (local_err) {
         error_propagate(errp, local_err);
         error_prepend(errp, "Could not reopen qed layer: ");
diff --git a/block/qed.h b/block/qed.h
index dd3a2d5519..f35341f134 100644
--- a/block/qed.h
+++ b/block/qed.h
@@ -151,15 +151,21 @@ typedef struct QEDAIOCB {
 
 typedef struct {
     BlockDriverState *bs;           /* device */
-    uint64_t file_size;             /* length of image file, in bytes */
 
+    /* Written only by an allocating write or the timer handler (the latter
+     * while allocating reqs are plugged).
+     */
     QEDHeader header;               /* always cpu-endian */
+
+    /* Protected by table_lock. */
+    CoMutex table_lock;
     QEDTable *l1_table;
     L2TableCache l2_cache;          /* l2 table cache */
     uint32_t table_nelems;
     uint32_t l1_shift;
     uint32_t l2_shift;
     uint32_t l2_mask;
+    uint64_t file_size;             /* length of image file, in bytes */
 
     /* Allocating write request queue */
     QEDAIOCB *allocating_acb;
@@ -177,9 +183,6 @@ enum {
     QED_CLUSTER_L1,                 /* cluster missing in L1 */
 };
 
-void qed_acquire(BDRVQEDState *s);
-void qed_release(BDRVQEDState *s);
-
 /**
  * Header functions
  */
-- 
2.13.0