1
The following changes since commit 8507c9d5c9a62de2a0e281b640f995e26eac46af:
1
The following changes since commit 6c769690ac845fa62642a5f93b4e4bd906adab95:
2
2
3
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2020-11-03 15:59:44 +0000)
3
Merge remote-tracking branch 'remotes/vsementsov/tags/pull-simplebench-2021-05-04' into staging (2021-05-21 12:02:34 +0100)
4
4
5
are available in the Git repository at:
5
are available in the Git repository at:
6
6
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
7
https://gitlab.com/stefanha/qemu.git tags/block-pull-request
8
8
9
for you to fetch changes up to fc107d86840b3364e922c26cf7631b7fd38ce523:
9
for you to fetch changes up to 0a6f0c76a030710780ce10d6347a70f098024d21:
10
10
11
util/vfio-helpers: Assert offset is aligned to page size (2020-11-03 19:06:23 +0000)
11
coroutine-sleep: introduce qemu_co_sleep (2021-05-21 18:22:33 +0100)
12
12
13
----------------------------------------------------------------
13
----------------------------------------------------------------
14
Pull request for 5.2
14
Pull request
15
15
16
NVMe fixes to solve IOMMU issues on non-x86 and error message/tracing
16
(Resent due to an email preparation mistake.)
17
improvements. Elena Afanasova's ioeventfd fixes are also included.
18
19
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
20
17
21
----------------------------------------------------------------
18
----------------------------------------------------------------
22
19
23
Elena Afanasova (2):
20
Paolo Bonzini (6):
24
accel/kvm: add PIO ioeventfds only in case kvm_eventfds_allowed is
21
coroutine-sleep: use a stack-allocated timer
25
true
22
coroutine-sleep: disallow NULL QemuCoSleepState** argument
26
softmmu/memory: fix memory_region_ioeventfd_equal()
23
coroutine-sleep: allow qemu_co_sleep_wake that wakes nothing
24
coroutine-sleep: move timer out of QemuCoSleepState
25
coroutine-sleep: replace QemuCoSleepState pointer with struct in the
26
API
27
coroutine-sleep: introduce qemu_co_sleep
27
28
28
Eric Auger (4):
29
Philippe Mathieu-Daudé (1):
29
block/nvme: Change size and alignment of IDENTIFY response buffer
30
bitops.h: Improve find_xxx_bit() documentation
30
block/nvme: Change size and alignment of queue
31
block/nvme: Change size and alignment of prp_list_pages
32
block/nvme: Align iov's va and size on host page size
33
31
34
Philippe Mathieu-Daudé (27):
32
Zenghui Yu (1):
35
MAINTAINERS: Cover "block/nvme.h" file
33
multi-process: Initialize variables declared with g_auto*
36
block/nvme: Use hex format to display offset in trace events
37
block/nvme: Report warning with warn_report()
38
block/nvme: Trace controller capabilities
39
block/nvme: Trace nvme_poll_queue() per queue
40
block/nvme: Improve nvme_free_req_queue_wait() trace information
41
block/nvme: Trace queue pair creation/deletion
42
block/nvme: Move definitions before structure declarations
43
block/nvme: Use unsigned integer for queue counter/size
44
block/nvme: Make nvme_identify() return boolean indicating error
45
block/nvme: Make nvme_init_queue() return boolean indicating error
46
block/nvme: Introduce Completion Queue definitions
47
block/nvme: Use definitions instead of magic values in add_io_queue()
48
block/nvme: Correctly initialize Admin Queue Attributes
49
block/nvme: Simplify ADMIN queue access
50
block/nvme: Simplify nvme_cmd_sync()
51
block/nvme: Set request_alignment at initialization
52
block/nvme: Correct minimum device page size
53
block/nvme: Fix use of write-only doorbells page on Aarch64 arch
54
block/nvme: Fix nvme_submit_command() on big-endian host
55
util/vfio-helpers: Improve reporting unsupported IOMMU type
56
util/vfio-helpers: Trace PCI I/O config accesses
57
util/vfio-helpers: Trace PCI BAR region info
58
util/vfio-helpers: Trace where BARs are mapped
59
util/vfio-helpers: Improve DMA trace events
60
util/vfio-helpers: Convert vfio_dump_mapping to trace events
61
util/vfio-helpers: Assert offset is aligned to page size
62
34
63
MAINTAINERS | 2 +
35
include/qemu/bitops.h | 15 ++++++--
64
include/block/nvme.h | 18 ++--
36
include/qemu/coroutine.h | 27 ++++++++-----
65
accel/kvm/kvm-all.c | 6 +-
37
block/block-copy.c | 10 ++---
66
block/nvme.c | 209 ++++++++++++++++++++++++-------------------
38
block/nbd.c | 14 +++----
67
softmmu/memory.c | 11 ++-
39
hw/remote/memory.c | 5 +--
68
util/vfio-helpers.c | 43 +++++----
40
hw/remote/proxy.c | 3 +-
69
block/trace-events | 30 ++++---
41
util/qemu-coroutine-sleep.c | 75 +++++++++++++++++++------------------
70
util/trace-events | 10 ++-
42
7 files changed, 79 insertions(+), 70 deletions(-)
71
8 files changed, 195 insertions(+), 134 deletions(-)
72
43
73
--
44
--
74
2.28.0
45
2.31.1
75
46
1
From: Elena Afanasova <eafanasova@gmail.com>
1
From: Zenghui Yu <yuzenghui@huawei.com>
2
2
3
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
3
Quote docs/devel/style.rst (section "Automatic memory deallocation"):
4
Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
4
5
Message-Id: <20201017210102.26036-1-eafanasova@gmail.com>
5
* Variables declared with g_auto* MUST always be initialized,
6
otherwise the cleanup function will use uninitialized stack memory
7
8
Initialize @name properly to get rid of the compilation error (using
9
gcc-7.3.0 on CentOS):
10
11
../hw/remote/proxy.c: In function 'pci_proxy_dev_realize':
12
/usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: 'name' may be used uninitialized in this function [-Werror=maybe-uninitialized]
13
g_free (*pp);
14
^~~~~~~~~~~~
15
../hw/remote/proxy.c:350:30: note: 'name' was declared here
16
g_autofree char *name;
17
^~~~
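The rule quoted above can be reproduced without GLib. What follows is a minimal sketch, assuming GCC or Clang, since the `cleanup` attribute that g_autofree builds on is a compiler extension. The names `free_charp`, `AUTOFREE`, `region_name_len` and `frees_seen` are hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Counts how many non-NULL buffers the cleanup handler released. */
int frees_seen;

/* Cleanup handler: called with the address of the annotated variable
 * whenever that variable goes out of scope. */
static void free_charp(char **pp)
{
    if (*pp) {
        frees_seen++;
    }
    free(*pp);   /* free(NULL) is a no-op, so a NULL initializer is safe */
}

#define AUTOFREE __attribute__((cleanup(free_charp)))

/* Builds "remote-mem-<suffix>" and returns its length, or -1 on error. */
int region_name_len(unsigned suffix)
{
    /* Initialized at the declaration: every exit path below hands the
     * handler either NULL or a valid heap pointer, never stack garbage. */
    AUTOFREE char *name = NULL;
    int len = snprintf(NULL, 0, "remote-mem-%u", suffix);

    if (len < 0) {
        return -1;
    }
    name = malloc((size_t)len + 1);
    if (name == NULL) {
        return -1;
    }
    snprintf(name, (size_t)len + 1, "remote-mem-%u", suffix);
    assert(strlen(name) == (size_t)len);
    return len;
}
```

Had `name` been declared without an initializer, any exit taken before the assignment would pass an indeterminate pointer to the handler. That is presumably the situation -Wmaybe-uninitialized is guarding against in the error quoted above.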
18
19
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
20
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
21
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
22
Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
23
Message-id: 20210312112143.1369-1-yuzenghui@huawei.com
6
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
24
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
7
---
25
---
8
accel/kvm/kvm-all.c | 6 ++++--
26
hw/remote/memory.c | 5 ++---
9
1 file changed, 4 insertions(+), 2 deletions(-)
27
hw/remote/proxy.c | 3 +--
28
2 files changed, 3 insertions(+), 5 deletions(-)
10
29
11
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
30
diff --git a/hw/remote/memory.c b/hw/remote/memory.c
12
index XXXXXXX..XXXXXXX 100644
31
index XXXXXXX..XXXXXXX 100644
13
--- a/accel/kvm/kvm-all.c
32
--- a/hw/remote/memory.c
14
+++ b/accel/kvm/kvm-all.c
33
+++ b/hw/remote/memory.c
15
@@ -XXX,XX +XXX,XX @@ static int kvm_init(MachineState *ms)
34
@@ -XXX,XX +XXX,XX @@ void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp)
16
35
17
kvm_memory_listener_register(s, &s->memory_listener,
36
remote_sysmem_reset();
18
&address_space_memory, 0);
37
19
- memory_listener_register(&kvm_io_listener,
38
- for (region = 0; region < msg->num_fds; region++) {
20
- &address_space_io);
39
- g_autofree char *name;
21
+ if (kvm_eventfds_allowed) {
40
+ for (region = 0; region < msg->num_fds; region++, suffix++) {
22
+ memory_listener_register(&kvm_io_listener,
41
+ g_autofree char *name = g_strdup_printf("remote-mem-%u", suffix);
23
+ &address_space_io);
42
subregion = g_new(MemoryRegion, 1);
24
+ }
43
- name = g_strdup_printf("remote-mem-%u", suffix++);
25
memory_listener_register(&kvm_coalesced_pio_listener,
44
memory_region_init_ram_from_fd(subregion, NULL,
26
&address_space_io);
45
name, sysmem_info->sizes[region],
27
46
true, msg->fds[region],
47
diff --git a/hw/remote/proxy.c b/hw/remote/proxy.c
48
index XXXXXXX..XXXXXXX 100644
49
--- a/hw/remote/proxy.c
50
+++ b/hw/remote/proxy.c
51
@@ -XXX,XX +XXX,XX @@ static void probe_pci_info(PCIDevice *dev, Error **errp)
52
PCI_BASE_ADDRESS_SPACE_IO : PCI_BASE_ADDRESS_SPACE_MEMORY;
53
54
if (size) {
55
- g_autofree char *name;
56
+ g_autofree char *name = g_strdup_printf("bar-region-%d", i);
57
pdev->region[i].dev = pdev;
58
pdev->region[i].present = true;
59
if (type == PCI_BASE_ADDRESS_SPACE_MEMORY) {
60
pdev->region[i].memory = true;
61
}
62
- name = g_strdup_printf("bar-region-%d", i);
63
memory_region_init_io(&pdev->region[i].mr, OBJECT(pdev),
64
&proxy_mr_ops, &pdev->region[i],
65
name, size);
28
--
66
--
29
2.28.0
67
2.31.1
30
68
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
2
3
The Completion Queue Command Identifier is a 16-bit value,
3
Document that the following functions return the bitmap size
4
so nvme_submit_command() is unlikely to work on big-endian
4
if no matching bit is found:
5
hosts, as the relevant bits are truncated.
6
Fix by using the correct byte-swap function.
7
5
8
Fixes: bdd6a90a9e5 ("block: Add VFIO based NVMe driver")
6
- find_first_bit
9
Reported-by: Keith Busch <kbusch@kernel.org>
7
- find_next_bit
8
- find_last_bit
9
- find_first_zero_bit
10
- find_next_zero_bit
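The "return the bitmap size when nothing matches" convention is what lets callers use these functions directly as loop bounds. Here is a minimal sketch of that convention, using a deliberately naive bit-by-bit helper. `sketch_find_next_bit` and `sketch_count_set_bits` are hypothetical names, not the implementation behind include/qemu/bitops.h:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Returns the index of the first set bit at or after @offset,
 * or @size if there are no further set bits. */
unsigned long sketch_find_next_bit(const unsigned long *addr,
                                   unsigned long size,
                                   unsigned long offset)
{
    assert(addr != NULL);
    for (unsigned long i = offset; i < size; i++) {
        if (addr[i / SKETCH_BITS_PER_LONG] &
            (1UL << (i % SKETCH_BITS_PER_LONG))) {
            return i;
        }
    }
    return size;   /* "not found" sentinel */
}

/* Walks every set bit; the loop stops as soon as the sentinel appears. */
unsigned long sketch_count_set_bits(const unsigned long *addr,
                                    unsigned long size)
{
    unsigned long count = 0;

    for (unsigned long bit = sketch_find_next_bit(addr, size, 0);
         bit < size;
         bit = sketch_find_next_bit(addr, size, bit + 1)) {
        count++;
    }
    return count;
}
```

A caller that needs a single lookup compares the result against the size it passed in, rather than against zero or a negative value.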
11
12
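For the nvme_submit_command() fix, the loss of the Command Identifier on a big-endian host can be simulated portably. In this sketch the `sketch_*` and `be_host_*` helpers are hypothetical stand-ins for what cpu_to_le32() and cpu_to_le16() would do on such a host. It assumes the destination field is 16 bits wide, as the commit message states:

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional byte swaps: on a big-endian host, converting to
 * little-endian always swaps. */
uint16_t sketch_bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

uint32_t sketch_bswap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Old behaviour: swap as 32 bits, then store into a 16-bit field.
 * The meaningful bytes end up in the upper half and are cut off. */
uint16_t be_host_cid_via_le32(uint16_t cid)
{
    return (uint16_t)sketch_bswap32(cid);
}

/* Fixed behaviour: swap at the width of the field. */
uint16_t be_host_cid_via_le16(uint16_t cid)
{
    return sketch_bswap16(cid);
}
```

On a little-endian host both conversions are no-ops, which would explain why the problem went unnoticed there.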
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
13
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Message-id: 20201029093306.1063879-25-philmd@redhat.com
15
Message-id: 20210510200758.2623154-2-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
17
---
16
block/nvme.c | 2 +-
18
include/qemu/bitops.h | 15 ++++++++++++---
17
1 file changed, 1 insertion(+), 1 deletion(-)
19
1 file changed, 12 insertions(+), 3 deletions(-)
18
20
19
diff --git a/block/nvme.c b/block/nvme.c
21
diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
20
index XXXXXXX..XXXXXXX 100644
22
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
23
--- a/include/qemu/bitops.h
22
+++ b/block/nvme.c
24
+++ b/include/qemu/bitops.h
23
@@ -XXX,XX +XXX,XX @@ static void nvme_submit_command(NVMeQueuePair *q, NVMeRequest *req,
25
@@ -XXX,XX +XXX,XX @@ static inline int test_bit(long nr, const unsigned long *addr)
24
assert(!req->cb);
26
* @addr: The address to start the search at
25
req->cb = cb;
27
* @size: The maximum size to search
26
req->opaque = opaque;
28
*
27
- cmd->cid = cpu_to_le32(req->cid);
29
- * Returns the bit number of the first set bit, or size.
28
+ cmd->cid = cpu_to_le16(req->cid);
30
+ * Returns the bit number of the last set bit,
29
31
+ * or @size if there is no set bit in the bitmap.
30
trace_nvme_submit_command(q->s, q->index, req->cid);
32
*/
31
nvme_trace_command(cmd);
33
unsigned long find_last_bit(const unsigned long *addr,
34
unsigned long size);
35
@@ -XXX,XX +XXX,XX @@ unsigned long find_last_bit(const unsigned long *addr,
36
* @addr: The address to base the search on
37
* @offset: The bitnumber to start searching at
38
* @size: The bitmap size in bits
39
+ *
40
+ * Returns the bit number of the next set bit,
41
+ * or @size if there are no further set bits in the bitmap.
42
*/
43
unsigned long find_next_bit(const unsigned long *addr,
44
unsigned long size,
45
@@ -XXX,XX +XXX,XX @@ unsigned long find_next_bit(const unsigned long *addr,
46
* @addr: The address to base the search on
47
* @offset: The bitnumber to start searching at
48
* @size: The bitmap size in bits
49
+ *
50
+ * Returns the bit number of the next cleared bit,
51
+ * or @size if there are no further clear bits in the bitmap.
52
*/
53
54
unsigned long find_next_zero_bit(const unsigned long *addr,
55
@@ -XXX,XX +XXX,XX @@ unsigned long find_next_zero_bit(const unsigned long *addr,
56
* @addr: The address to start the search at
57
* @size: The maximum size to search
58
*
59
- * Returns the bit number of the first set bit.
60
+ * Returns the bit number of the first set bit,
61
+ * or @size if there is no set bit in the bitmap.
62
*/
63
static inline unsigned long find_first_bit(const unsigned long *addr,
64
unsigned long size)
65
@@ -XXX,XX +XXX,XX @@ static inline unsigned long find_first_bit(const unsigned long *addr,
66
* @addr: The address to start the search at
67
* @size: The maximum size to search
68
*
69
- * Returns the bit number of the first cleared bit.
70
+ * Returns the bit number of the first cleared bit,
71
+ * or @size if there is no clear bit in the bitmap.
72
*/
73
static inline unsigned long find_first_zero_bit(const unsigned long *addr,
74
unsigned long size)
32
--
75
--
33
2.28.0
76
2.31.1
34
77
1
From: Elena Afanasova <eafanasova@gmail.com>
1
From: Paolo Bonzini <pbonzini@redhat.com>
2
2
3
Eventfd can be registered with a zero length when fast_mmio is true.
3
The lifetime of the timer is well-known (it cannot outlive
4
Handle this case properly when dispatching through QEMU.
4
qemu_co_sleep_ns_wakeable, because it's deleted by the time the
5
coroutine resumes), so it is not necessary to place it on the heap.
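For the zero-length ioeventfd case handled in softmmu/memory.c, the comparison in the hunk below can be restated with plain integers. This is a simplified sketch: `SketchIoeventfd` and `sketch_ioeventfd_equal` are hypothetical, and the real code uses Int128 address ranges and an EventNotifier pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchIoeventfd {
    uint64_t start;        /* start address of the range */
    uint64_t size;         /* may be zero (allowed when fast_mmio is true) */
    bool match_data;
    uint64_t data;
    const void *notifier;  /* stands in for the EventNotifier pointer */
} SketchIoeventfd;

bool sketch_ioeventfd_equal(const SketchIoeventfd *a,
                            const SketchIoeventfd *b)
{
    assert(a != NULL && b != NULL);
    if (a->start != b->start) {
        return false;
    }
    /* A zero-length entry on either side matches on address alone. */
    if (a->size == 0 || b->size == 0) {
        return true;
    }
    /* Otherwise every remaining attribute has to agree. */
    return a->size == b->size &&
           a->match_data == b->match_data &&
           (!a->match_data || a->data == b->data) &&
           a->notifier == b->notifier;
}
```

The address check comes first, so a zero-length entry never matches an entry registered at a different address.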
5
6
6
Signed-off-by: Elena Afanasova <eafanasova@gmail.com>
7
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
7
Message-id: cf71a62eb04e61932ff8ffdd02e0b2aab4f495a0.camel@gmail.com
8
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9
Message-id: 20210517100548.28806-2-pbonzini@redhat.com
8
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
---
11
---
10
softmmu/memory.c | 11 +++++++++--
12
util/qemu-coroutine-sleep.c | 9 ++++-----
11
1 file changed, 9 insertions(+), 2 deletions(-)
13
1 file changed, 4 insertions(+), 5 deletions(-)
12
14
13
diff --git a/softmmu/memory.c b/softmmu/memory.c
15
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
14
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
15
--- a/softmmu/memory.c
17
--- a/util/qemu-coroutine-sleep.c
16
+++ b/softmmu/memory.c
18
+++ b/util/qemu-coroutine-sleep.c
17
@@ -XXX,XX +XXX,XX @@ static bool memory_region_ioeventfd_before(MemoryRegionIoeventfd *a,
19
@@ -XXX,XX +XXX,XX @@ static const char *qemu_co_sleep_ns__scheduled = "qemu_co_sleep_ns";
18
static bool memory_region_ioeventfd_equal(MemoryRegionIoeventfd *a,
20
19
MemoryRegionIoeventfd *b)
21
struct QemuCoSleepState {
20
{
22
Coroutine *co;
21
- return !memory_region_ioeventfd_before(a, b)
23
- QEMUTimer *ts;
22
- && !memory_region_ioeventfd_before(b, a);
24
+ QEMUTimer ts;
23
+ if (int128_eq(a->addr.start, b->addr.start) &&
25
QemuCoSleepState **user_state_pointer;
24
+ (!int128_nz(a->addr.size) || !int128_nz(b->addr.size) ||
26
};
25
+ (int128_eq(a->addr.size, b->addr.size) &&
27
26
+ (a->match_data == b->match_data) &&
28
@@ -XXX,XX +XXX,XX @@ void qemu_co_sleep_wake(QemuCoSleepState *sleep_state)
27
+ ((a->match_data && (a->data == b->data)) || !a->match_data) &&
29
if (sleep_state->user_state_pointer) {
28
+ (a->e == b->e))))
30
*sleep_state->user_state_pointer = NULL;
29
+ return true;
31
}
30
+
32
- timer_del(sleep_state->ts);
31
+ return false;
33
+ timer_del(&sleep_state->ts);
34
aio_co_wake(sleep_state->co);
32
}
35
}
33
36
34
/* Range of memory in the global map. Addresses are absolute. */
37
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
38
AioContext *ctx = qemu_get_current_aio_context();
39
QemuCoSleepState state = {
40
.co = qemu_coroutine_self(),
41
- .ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, &state),
42
.user_state_pointer = sleep_state,
43
};
44
45
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
46
abort();
47
}
48
49
+ aio_timer_init(ctx, &state.ts, type, SCALE_NS, co_sleep_cb, &state);
50
if (sleep_state) {
51
*sleep_state = &state;
52
}
53
- timer_mod(state.ts, qemu_clock_get_ns(type) + ns);
54
+ timer_mod(&state.ts, qemu_clock_get_ns(type) + ns);
55
qemu_coroutine_yield();
56
if (sleep_state) {
57
/*
58
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
59
*/
60
assert(*sleep_state == NULL);
61
}
62
- timer_free(state.ts);
63
}
35
--
64
--
36
2.28.0
65
2.31.1
37
66
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

The "block/nvme.h" header is shared by both the NVMe block
driver and the NVMe emulated device. Add the 'F:' entry to
both sections, so all maintainers/reviewers are notified
when it is changed.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Message-Id: <20200701140634.25994-1-philmd@redhat.com>
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ M: Klaus Jensen <its@irrelevant.dk>
 L: qemu-block@nongnu.org
 S: Supported
 F: hw/block/nvme*
+F: include/block/nvme.h
 F: tests/qtest/nvme-test.c
 F: docs/specs/nvme.txt
 T: git git://git.infradead.org/qemu-nvme.git nvme-next
@@ -XXX,XX +XXX,XX @@ R: Fam Zheng <fam@euphon.net>
 L: qemu-block@nongnu.org
 S: Supported
 F: block/nvme*
+F: include/block/nvme.h
 T: git https://github.com/stefanha/qemu.git block

 Bootdevice
--
2.28.0
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

Display offsets in hexadecimal, matching the format used for the
hw/vfio/ trace events.
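The effect of the format change can be seen with plain printf-style formatting. This is a minimal sketch: the helper `sketch_format_rw` is hypothetical and only mimics the shape of a trace-event format string, not how QEMU's tracing backends emit events. It uses PRIu64 for the byte count, an assumption made here to match the unsigned argument type.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Formats an offset in hex (with 0x prefix) and a byte count in decimal.
 * Returns the number of characters written, as snprintf() does. */
int sketch_format_rw(char *buf, size_t len, uint64_t offset, uint64_t bytes)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "offset 0x%" PRIx64 " bytes %" PRIu64,
                    offset, bytes);
}
```

Hex offsets are easier to compare against page boundaries and against the addresses printed by other VFIO trace events.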

Suggested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201029093306.1063879-3-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 block/trace-events | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/trace-events b/block/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -XXX,XX +XXX,XX @@ nvme_submit_command(void *s, int index, int cid) "s %p queue %d cid %d"
 nvme_submit_command_raw(int c0, int c1, int c2, int c3, int c4, int c5, int c6, int c7) "%02x %02x %02x %02x %02x %02x %02x %02x"
 nvme_handle_event(void *s) "s %p"
 nvme_poll_cb(void *s) "s %p"
-nvme_prw_aligned(void *s, int is_write, uint64_t offset, uint64_t bytes, int flags, int niov) "s %p is_write %d offset %"PRId64" bytes %"PRId64" flags %d niov %d"
-nvme_write_zeroes(void *s, uint64_t offset, uint64_t bytes, int flags) "s %p offset %"PRId64" bytes %"PRId64" flags %d"
+nvme_prw_aligned(void *s, int is_write, uint64_t offset, uint64_t bytes, int flags, int niov) "s %p is_write %d offset 0x%"PRIx64" bytes %"PRId64" flags %d niov %d"
+nvme_write_zeroes(void *s, uint64_t offset, uint64_t bytes, int flags) "s %p offset 0x%"PRIx64" bytes %"PRId64" flags %d"
 nvme_qiov_unaligned(const void *qiov, int n, void *base, size_t size, int align) "qiov %p n %d base %p size 0x%zx align 0x%x"
-nvme_prw_buffered(void *s, uint64_t offset, uint64_t bytes, int niov, int is_write) "s %p offset %"PRId64" bytes %"PRId64" niov %d is_write %d"
-nvme_rw_done(void *s, int is_write, uint64_t offset, uint64_t bytes, int ret) "s %p is_write %d offset %"PRId64" bytes %"PRId64" ret %d"
-nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset %"PRId64" bytes %"PRId64""
-nvme_dsm_done(void *s, uint64_t offset, uint64_t bytes, int ret) "s %p offset %"PRId64" bytes %"PRId64" ret %d"
+nvme_prw_buffered(void *s, uint64_t offset, uint64_t bytes, int niov, int is_write) "s %p offset 0x%"PRIx64" bytes %"PRId64" niov %d is_write %d"
+nvme_rw_done(void *s, int is_write, uint64_t offset, uint64_t bytes, int ret) "s %p is_write %d offset 0x%"PRIx64" bytes %"PRId64" ret %d"
+nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset 0x%"PRIx64" bytes %"PRId64""
+nvme_dsm_done(void *s, uint64_t offset, uint64_t bytes, int ret) "s %p offset 0x%"PRIx64" bytes %"PRId64" ret %d"
 nvme_dma_map_flush(void *s) "s %p"
 nvme_free_req_queue_wait(void *q) "q %p"
 nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) "s %p cmd %p req %p qiov %p entries %d"
--
2.28.0
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

Instead of printing the warning to stderr, use warn_report(),
which also displays it on the monitor.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201029093306.1063879-4-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 block/nvme.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -XXX,XX +XXX,XX @@ static bool nvme_process_completion(NVMeQueuePair *q)
         }
         cid = le16_to_cpu(c->cid);
         if (cid == 0 || cid > NVME_QUEUE_SIZE) {
-            fprintf(stderr, "Unexpected CID in completion queue: %" PRIu32 "\n",
-                    cid);
+            warn_report("NVMe: Unexpected CID in completion queue: %"PRIu32", "
+                        "queue size: %u", cid, NVME_QUEUE_SIZE);
             continue;
         }
         trace_nvme_complete_command(s, q->index, cid);
--
2.28.0
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

Controllers have different capabilities and report them in the
CAP register. We are particularly interested in the page size
limits.
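The page size limits are encoded as powers of two relative to 4 KiB. This sketch decodes them from a raw CAP value, assuming the field positions of the NVMe specification as I understand them (MQES in bits 15:0, MPSMIN in bits 51:48, MPSMAX in bits 55:52). The `sketch_*` helpers are hypothetical and are not the NVME_CAP_* macros used by the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction; bit positions are an assumption stated above. */
unsigned sketch_cap_mqes(uint64_t cap)   { return (unsigned)(cap & 0xffff); }
unsigned sketch_cap_mpsmin(uint64_t cap) { return (unsigned)((cap >> 48) & 0xf); }
unsigned sketch_cap_mpsmax(uint64_t cap) { return (unsigned)((cap >> 52) & 0xf); }

/* MQES is zero-based, so the usable queue depth is one more. */
unsigned sketch_queue_entries(uint64_t cap)
{
    return 1 + sketch_cap_mqes(cap);
}

/* Page sizes are 2^(12 + field) bytes. */
uint64_t sketch_page_size_min(uint64_t cap)
{
    return UINT64_C(1) << (12 + sketch_cap_mpsmin(cap));
}

uint64_t sketch_page_size_max(uint64_t cap)
{
    return UINT64_C(1) << (12 + sketch_cap_mpsmax(cap));
}
```

A driver running on a host whose page size falls outside the range from `sketch_page_size_min()` to `sketch_page_size_max()` would need to pick a different device page size, which is why these two values are worth tracing.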

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201029093306.1063879-5-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 block/nvme.c       | 13 +++++++++++++
 block/trace-events |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/block/nvme.c b/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
      * Initialization". */

     cap = le64_to_cpu(regs->cap);
+    trace_nvme_controller_capability_raw(cap);
+    trace_nvme_controller_capability("Maximum Queue Entries Supported",
+                                     1 + NVME_CAP_MQES(cap));
+    trace_nvme_controller_capability("Contiguous Queues Required",
+                                     NVME_CAP_CQR(cap));
+    trace_nvme_controller_capability("Doorbell Stride",
+                                     2 << (2 + NVME_CAP_DSTRD(cap)));
+    trace_nvme_controller_capability("Subsystem Reset Supported",
+                                     NVME_CAP_NSSRS(cap));
+    trace_nvme_controller_capability("Memory Page Size Minimum",
+                                     1 << (12 + NVME_CAP_MPSMIN(cap)));
+    trace_nvme_controller_capability("Memory Page Size Maximum",
+                                     1 << (12 + NVME_CAP_MPSMAX(cap)));
     if (!NVME_CAP_CSS(cap)) {
         error_setg(errp, "Device doesn't support NVMe command set");
         ret = -EINVAL;
diff --git a/block/trace-events b/block/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -XXX,XX +XXX,XX @@ qed_aio_write_postfill(void *s, void *acb, uint64_t start, size_t len, uint64_t
 qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"

 # nvme.c
+nvme_controller_capability_raw(uint64_t value) "0x%08"PRIx64
+nvme_controller_capability(const char *desc, uint64_t value) "%s: %"PRIu64
 nvme_kick(void *s, int queue) "s %p queue %d"
 nvme_dma_flush_queue_wait(void *s) "s %p"
 nvme_error(int cmd_specific, int sq_head, int sqid, int cid, int status) "cmd_specific %d sq_head %d sqid %d cid %d status 0x%x"
--
2.28.0
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
From: Paolo Bonzini <pbonzini@redhat.com>
2
2
3
As we want to enable multiple queues, report the event
3
Simplify the code by removing conditionals. qemu_co_sleep_ns
4
in each nvme_poll_queue() call, rather than once in
4
can simply point the argument to an on-stack temporary.
5
the callback calling nvme_poll_queues().
6
5
7
Reviewed-by: Eric Auger <eric.auger@redhat.com>
6
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9
Tested-by: Eric Auger <eric.auger@redhat.com>
8
Message-id: 20210517100548.28806-3-pbonzini@redhat.com
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Message-id: 20201029093306.1063879-6-philmd@redhat.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Tested-by: Eric Auger <eric.auger@redhat.com>
14
---
10
---
15
block/nvme.c | 2 +-
11
include/qemu/coroutine.h | 5 +++--
16
block/trace-events | 2 +-
12
util/qemu-coroutine-sleep.c | 18 +++++-------------
17
2 files changed, 2 insertions(+), 2 deletions(-)
13
2 files changed, 8 insertions(+), 15 deletions(-)
18
14
19
diff --git a/block/nvme.c b/block/nvme.c
15
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
20
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
17
--- a/include/qemu/coroutine.h
22
+++ b/block/nvme.c
18
+++ b/include/qemu/coroutine.h
23
@@ -XXX,XX +XXX,XX @@ static bool nvme_poll_queue(NVMeQueuePair *q)
19
@@ -XXX,XX +XXX,XX @@ typedef struct QemuCoSleepState QemuCoSleepState;
24
const size_t cqe_offset = q->cq.head * NVME_CQ_ENTRY_BYTES;
20
25
NvmeCqe *cqe = (NvmeCqe *)&q->cq.queue[cqe_offset];
21
/**
26
22
* Yield the coroutine for a given duration. During this yield, @sleep_state
27
+ trace_nvme_poll_queue(q->s, q->index);
23
- * (if not NULL) is set to an opaque pointer, which may be used for
28
/*
24
+ * is set to an opaque pointer, which may be used for
29
* Do an early check for completions. q->lock isn't needed because
25
* qemu_co_sleep_wake(). Be careful, the pointer is set back to zero when the
30
* nvme_process_completion() only runs in the event loop thread and
26
* timer fires. Don't save the obtained value to other variables and don't call
31
@@ -XXX,XX +XXX,XX @@ static bool nvme_poll_cb(void *opaque)
27
* qemu_co_sleep_wake from another aio context.
32
BDRVNVMeState *s = container_of(e, BDRVNVMeState,
28
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
33
irq_notifier[MSIX_SHARED_IRQ_IDX]);
29
QemuCoSleepState **sleep_state);
34
30
static inline void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
35
- trace_nvme_poll_cb(s);
31
{
36
return nvme_poll_queues(s);
32
- qemu_co_sleep_ns_wakeable(type, ns, NULL);
33
+ QemuCoSleepState *unused = NULL;
34
+ qemu_co_sleep_ns_wakeable(type, ns, &unused);
37
}
35
}
38
36
39
diff --git a/block/trace-events b/block/trace-events
37
/**
38
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
40
index XXXXXXX..XXXXXXX 100644
39
index XXXXXXX..XXXXXXX 100644
41
--- a/block/trace-events
40
--- a/util/qemu-coroutine-sleep.c
42
+++ b/block/trace-events
41
+++ b/util/qemu-coroutine-sleep.c
43
@@ -XXX,XX +XXX,XX @@ nvme_complete_command(void *s, int index, int cid) "s %p queue %d cid %d"
42
@@ -XXX,XX +XXX,XX @@ void qemu_co_sleep_wake(QemuCoSleepState *sleep_state)
44
nvme_submit_command(void *s, int index, int cid) "s %p queue %d cid %d"
43
qemu_co_sleep_ns__scheduled, NULL);
45
nvme_submit_command_raw(int c0, int c1, int c2, int c3, int c4, int c5, int c6, int c7) "%02x %02x %02x %02x %02x %02x %02x %02x"
44
46
nvme_handle_event(void *s) "s %p"
45
assert(scheduled == qemu_co_sleep_ns__scheduled);
47
-nvme_poll_cb(void *s) "s %p"
46
- if (sleep_state->user_state_pointer) {
48
+nvme_poll_queue(void *s, unsigned q_index) "s %p q #%u"
47
- *sleep_state->user_state_pointer = NULL;
49
nvme_prw_aligned(void *s, int is_write, uint64_t offset, uint64_t bytes, int flags, int niov) "s %p is_write %d offset 0x%"PRIx64" bytes %"PRId64" flags %d niov %d"
48
- }
50
nvme_write_zeroes(void *s, uint64_t offset, uint64_t bytes, int flags) "s %p offset 0x%"PRIx64" bytes %"PRId64" flags %d"
49
+ *sleep_state->user_state_pointer = NULL;
51
nvme_qiov_unaligned(const void *qiov, int n, void *base, size_t size, int align) "qiov %p n %d base %p size 0x%zx align 0x%x"
50
timer_del(&sleep_state->ts);
51
aio_co_wake(sleep_state->co);
52
}
53
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
54
}
55
56
aio_timer_init(ctx, &state.ts, type, SCALE_NS, co_sleep_cb, &state);
57
- if (sleep_state) {
58
- *sleep_state = &state;
59
- }
60
+ *sleep_state = &state;
61
timer_mod(&state.ts, qemu_clock_get_ns(type) + ns);
62
qemu_coroutine_yield();
63
- if (sleep_state) {
64
- /*
65
- * Note that *sleep_state is cleared during qemu_co_sleep_wake
66
- * before resuming this coroutine.
67
- */
68
- assert(*sleep_state == NULL);
69
- }
70
+
71
+ /* qemu_co_sleep_wake clears *sleep_state before resuming this coroutine. */
72
+ assert(*sleep_state == NULL);
73
}
52
--
74
--
53
2.28.0
75
2.31.1
54
76
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

What we want to trace is the block driver state and the queue index.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201029093306.1063879-7-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 block/nvme.c       | 2 +-
 block/trace-events | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -XXX,XX +XXX,XX @@ static NVMeRequest *nvme_get_free_req(NVMeQueuePair *q)

     while (q->free_req_head == -1) {
         if (qemu_in_coroutine()) {
-            trace_nvme_free_req_queue_wait(q);
+            trace_nvme_free_req_queue_wait(q->s, q->index);
             qemu_co_queue_wait(&q->free_req_queue, &q->lock);
         } else {
             qemu_mutex_unlock(&q->lock);
diff --git a/block/trace-events b/block/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -XXX,XX +XXX,XX @@ nvme_rw_done(void *s, int is_write, uint64_t offset, uint64_t bytes, int ret) "s
 nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset 0x%"PRIx64" bytes %"PRId64""
 nvme_dsm_done(void *s, uint64_t offset, uint64_t bytes, int ret) "s %p offset 0x%"PRIx64" bytes %"PRId64" ret %d"
 nvme_dma_map_flush(void *s) "s %p"
-nvme_free_req_queue_wait(void *q) "q %p"
+nvme_free_req_queue_wait(void *s, unsigned q_index) "s %p q #%u"
 nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) "s %p cmd %p req %p qiov %p entries %d"
 nvme_cmd_map_qiov_pages(void *s, int i, uint64_t page) "s %p page[%d] 0x%"PRIx64
 nvme_cmd_map_qiov_iov(void *s, int i, void *page, int pages) "s %p iov[%d] %p pages %d"
--
2.28.0
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201029093306.1063879-8-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
 block/nvme.c       | 3 +++
 block/trace-events | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/block/nvme.c b/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -XXX,XX +XXX,XX @@ static void nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,

 static void nvme_free_queue_pair(NVMeQueuePair *q)
 {
+    trace_nvme_free_queue_pair(q->index, q);
     if (q->completion_bh) {
         qemu_bh_delete(q->completion_bh);
     }
@@ -XXX,XX +XXX,XX @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
     if (!q) {
         return NULL;
     }
+    trace_nvme_create_queue_pair(idx, q, size, aio_context,
+                                 event_notifier_get_fd(s->irq_notifier));
     q->prp_list_pages = qemu_try_memalign(s->page_size,
                                           s->page_size * NVME_NUM_REQS);
     if (!q->prp_list_pages) {
diff --git a/block/trace-events b/block/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -XXX,XX +XXX,XX @@ nvme_dsm(void *s, uint64_t offset, uint64_t bytes) "s %p offset 0x%"PRIx64" byte
 nvme_dsm_done(void *s, uint64_t offset, uint64_t bytes, int ret) "s %p offset 0x%"PRIx64" bytes %"PRId64" ret %d"
 nvme_dma_map_flush(void *s) "s %p"
 nvme_free_req_queue_wait(void *s, unsigned q_index) "s %p q #%u"
+nvme_create_queue_pair(unsigned q_index, void *q, unsigned size, void *aio_context, int fd) "index %u q %p size %u aioctx %p fd %d"
+nvme_free_queue_pair(unsigned q_index, void *q) "index %u q %p"
 nvme_cmd_map_qiov(void *s, void *cmd, void *req, void *qiov, int entries) "s %p cmd %p req %p qiov %p entries %d"
 nvme_cmd_map_qiov_pages(void *s, int i, uint64_t page) "s %p page[%d] 0x%"PRIx64
 nvme_cmd_map_qiov_iov(void *s, int i, void *page, int pages) "s %p iov[%d] %p pages %d"
--
2.28.0
Deleted patch
From: Philippe Mathieu-Daudé <philmd@redhat.com>

To be able to use some definitions in structure declarations,
move them earlier. No logical change.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201029093306.1063879-9-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
block/nvme.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/block/nvme.c b/block/nvme.c
index XXXXXXX..XXXXXXX 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -XXX,XX +XXX,XX @@

typedef struct BDRVNVMeState BDRVNVMeState;

+/* Same index is used for queues and IRQs */
+#define INDEX_ADMIN 0
+#define INDEX_IO(n) (1 + n)
+
+/* This driver shares a single MSIX IRQ for the admin and I/O queues */
+enum {
+ MSIX_SHARED_IRQ_IDX = 0,
+ MSIX_IRQ_COUNT = 1
+};
+
typedef struct {
int32_t head, tail;
uint8_t *queue;
@@ -XXX,XX +XXX,XX @@ typedef struct {
QEMUBH *completion_bh;
} NVMeQueuePair;

-#define INDEX_ADMIN 0
-#define INDEX_IO(n) (1 + n)
-
-/* This driver shares a single MSIX IRQ for the admin and I/O queues */
-enum {
- MSIX_SHARED_IRQ_IDX = 0,
- MSIX_IRQ_COUNT = 1
-};
-
struct BDRVNVMeState {
AioContext *aio_context;
QEMUVFIOState *vfio;
--
2.28.0

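The index macros moved by the patch above map the admin queue to slot 0 and the n-th I/O queue to slot 1 + n. The sketch below restates that mapping in standalone form. Two things are assumptions for illustration: the helper `io_queue_slot()` is hypothetical, and `INDEX_IO` is written with its argument parenthesized, which deviates from the patch text and only matters when the argument is a compound expression.

```c
/* Same index is used for queues and IRQs (as in the patch). */
#define INDEX_ADMIN 0

/*
 * Deviation from the patch, for illustration only: the argument is
 * parenthesized so that compound expressions such as INDEX_IO(a | b)
 * expand to 1 + (a | b) rather than (1 + a) | b.
 */
#define INDEX_IO(n) (1 + (n))

/* Hypothetical helper: slot in the queues[] array for the n-th I/O queue. */
static unsigned io_queue_slot(unsigned n)
{
    return INDEX_IO(n);
}
```

With a plain variable or constant as the argument, which is how the driver code in these patches calls `INDEX_IO`, both spellings of the macro yield the same value.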
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
From: Paolo Bonzini <pbonzini@redhat.com>
2
2
3
As all commands use the ADMIN queue, it is pointless to pass
3
All callers of qemu_co_sleep_wake are checking whether they are passing
4
it as argument each time. Remove the argument, and rename the
4
a NULL argument inside the pointer-to-pointer: do the check in
5
function as nvme_admin_cmd_sync() to make this new behavior
5
qemu_co_sleep_wake itself.
6
clearer.
7
6
8
Reviewed-by: Eric Auger <eric.auger@redhat.com>
7
As a side effect, qemu_co_sleep_wake can be called more than once and
9
Tested-by: Eric Auger <eric.auger@redhat.com>
8
it will only wake the coroutine once; after the first time, the argument
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
9
will be set to NULL via *sleep_state->user_state_pointer. However, this
11
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
would not be safe if co_sleep_cb kept using the QemuCoSleepState*
12
Message-id: 20201029093306.1063879-17-philmd@redhat.com
11
directly, so make it go through the pointer-to-pointer instead.
12
13
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
14
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
15
Message-id: 20210517100548.28806-4-pbonzini@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
16
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
17
---
16
block/nvme.c | 19 ++++++++++---------
18
block/block-copy.c | 4 +---
17
1 file changed, 10 insertions(+), 9 deletions(-)
19
block/nbd.c | 8 ++------
20
util/qemu-coroutine-sleep.c | 21 ++++++++++++---------
21
3 files changed, 15 insertions(+), 18 deletions(-)
18
22
19
diff --git a/block/nvme.c b/block/nvme.c
23
diff --git a/block/block-copy.c b/block/block-copy.c
20
index XXXXXXX..XXXXXXX 100644
24
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
25
--- a/block/block-copy.c
22
+++ b/block/nvme.c
26
+++ b/block/block-copy.c
23
@@ -XXX,XX +XXX,XX @@ static void nvme_submit_command(NVMeQueuePair *q, NVMeRequest *req,
27
@@ -XXX,XX +XXX,XX @@ out:
24
qemu_mutex_unlock(&q->lock);
28
29
void block_copy_kick(BlockCopyCallState *call_state)
30
{
31
- if (call_state->sleep_state) {
32
- qemu_co_sleep_wake(call_state->sleep_state);
33
- }
34
+ qemu_co_sleep_wake(call_state->sleep_state);
25
}
35
}
26
36
27
-static void nvme_cmd_sync_cb(void *opaque, int ret)
37
/*
28
+static void nvme_admin_cmd_sync_cb(void *opaque, int ret)
38
diff --git a/block/nbd.c b/block/nbd.c
39
index XXXXXXX..XXXXXXX 100644
40
--- a/block/nbd.c
41
+++ b/block/nbd.c
42
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn nbd_client_co_drain_begin(BlockDriverState *bs)
43
BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
44
45
s->drained = true;
46
- if (s->connection_co_sleep_ns_state) {
47
- qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
48
- }
49
+ qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
50
51
nbd_co_establish_connection_cancel(bs, false);
52
53
@@ -XXX,XX +XXX,XX @@ static void nbd_teardown_connection(BlockDriverState *bs)
54
55
s->state = NBD_CLIENT_QUIT;
56
if (s->connection_co) {
57
- if (s->connection_co_sleep_ns_state) {
58
- qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
59
- }
60
+ qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
61
nbd_co_establish_connection_cancel(bs, true);
62
}
63
if (qemu_in_coroutine()) {
64
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
65
index XXXXXXX..XXXXXXX 100644
66
--- a/util/qemu-coroutine-sleep.c
67
+++ b/util/qemu-coroutine-sleep.c
68
@@ -XXX,XX +XXX,XX @@ struct QemuCoSleepState {
69
70
void qemu_co_sleep_wake(QemuCoSleepState *sleep_state)
29
{
71
{
30
int *pret = opaque;
72
- /* Write of schedule protected by barrier write in aio_co_schedule */
31
*pret = ret;
73
- const char *scheduled = qatomic_cmpxchg(&sleep_state->co->scheduled,
32
aio_wait_kick();
74
- qemu_co_sleep_ns__scheduled, NULL);
75
+ if (sleep_state) {
76
+ /* Write of schedule protected by barrier write in aio_co_schedule */
77
+ const char *scheduled = qatomic_cmpxchg(&sleep_state->co->scheduled,
78
+ qemu_co_sleep_ns__scheduled, NULL);
79
80
- assert(scheduled == qemu_co_sleep_ns__scheduled);
81
- *sleep_state->user_state_pointer = NULL;
82
- timer_del(&sleep_state->ts);
83
- aio_co_wake(sleep_state->co);
84
+ assert(scheduled == qemu_co_sleep_ns__scheduled);
85
+ *sleep_state->user_state_pointer = NULL;
86
+ timer_del(&sleep_state->ts);
87
+ aio_co_wake(sleep_state->co);
88
+ }
33
}
89
}
34
90
35
-static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
91
static void co_sleep_cb(void *opaque)
36
- NvmeCmd *cmd)
37
+static int nvme_admin_cmd_sync(BlockDriverState *bs, NvmeCmd *cmd)
38
{
92
{
39
+ BDRVNVMeState *s = bs->opaque;
93
- qemu_co_sleep_wake(opaque);
40
+ NVMeQueuePair *q = s->queues[INDEX_ADMIN];
94
+ QemuCoSleepState **sleep_state = opaque;
41
AioContext *aio_context = bdrv_get_aio_context(bs);
95
+ qemu_co_sleep_wake(*sleep_state);
42
NVMeRequest *req;
96
}
43
int ret = -EINPROGRESS;
97
44
@@ -XXX,XX +XXX,XX @@ static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
98
void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
45
if (!req) {
99
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
46
return -EBUSY;
100
abort();
47
}
101
}
48
- nvme_submit_command(q, req, cmd, nvme_cmd_sync_cb, &ret);
102
49
+ nvme_submit_command(q, req, cmd, nvme_admin_cmd_sync_cb, &ret);
103
- aio_timer_init(ctx, &state.ts, type, SCALE_NS, co_sleep_cb, &state);
50
104
+ aio_timer_init(ctx, &state.ts, type, SCALE_NS, co_sleep_cb, sleep_state);
51
AIO_WAIT_WHILE(aio_context, ret == -EINPROGRESS);
105
*sleep_state = &state;
52
return ret;
106
timer_mod(&state.ts, qemu_clock_get_ns(type) + ns);
53
@@ -XXX,XX +XXX,XX @@ static bool nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
107
qemu_coroutine_yield();
54
55
memset(id, 0, sizeof(*id));
56
cmd.dptr.prp1 = cpu_to_le64(iova);
57
- if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
58
+ if (nvme_admin_cmd_sync(bs, &cmd)) {
59
error_setg(errp, "Failed to identify controller");
60
goto out;
61
}
62
@@ -XXX,XX +XXX,XX @@ static bool nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
63
memset(id, 0, sizeof(*id));
64
cmd.cdw10 = 0;
65
cmd.nsid = cpu_to_le32(namespace);
66
- if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
67
+ if (nvme_admin_cmd_sync(bs, &cmd)) {
68
error_setg(errp, "Failed to identify namespace");
69
goto out;
70
}
71
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
72
.cdw10 = cpu_to_le32(((queue_size - 1) << 16) | n),
73
.cdw11 = cpu_to_le32(NVME_CQ_IEN | NVME_CQ_PC),
74
};
75
- if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
76
+ if (nvme_admin_cmd_sync(bs, &cmd)) {
77
error_setg(errp, "Failed to create CQ io queue [%u]", n);
78
goto out_error;
79
}
80
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
81
.cdw10 = cpu_to_le32(((queue_size - 1) << 16) | n),
82
.cdw11 = cpu_to_le32(NVME_SQ_PC | (n << 16)),
83
};
84
- if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
85
+ if (nvme_admin_cmd_sync(bs, &cmd)) {
86
error_setg(errp, "Failed to create SQ io queue [%u]", n);
87
goto out_error;
88
}
89
@@ -XXX,XX +XXX,XX @@ static int nvme_enable_disable_write_cache(BlockDriverState *bs, bool enable,
90
.cdw11 = cpu_to_le32(enable ? 0x01 : 0x00),
91
};
92
93
- ret = nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd);
94
+ ret = nvme_admin_cmd_sync(bs, &cmd);
95
if (ret) {
96
error_setg(errp, "Failed to configure NVMe write cache");
97
}
98
--
108
--
99
2.28.0
109
2.31.1
100
110
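The wake semantics described in the coroutine-sleep commit message above can be modeled without any QEMU machinery. The following is a minimal sketch under stated assumptions: `Task`, `SleepState`, `sleep_wake()` and `sleep_timer_cb()` are toy stand-ins invented for this example, and incrementing a counter stands in for `aio_co_wake()`. It shows the two points the message makes: the NULL check lives in the wake function itself, and the timer callback reads the user's pointer at fire time, so a wake after the first one finds NULL and does nothing.

```c
#include <stddef.h>

/* Toy stand-in for a coroutine: counts how often it was resumed. */
typedef struct Task {
    int resumed;
} Task;

/* Toy stand-in for QemuCoSleepState as it looks in this patch. */
typedef struct SleepState SleepState;
struct SleepState {
    Task *co;
    SleepState **user_state_pointer;
};

/* NULL-tolerant wake: callers no longer need their own check. */
static void sleep_wake(SleepState *state)
{
    if (state) {
        /* Clear the user's pointer so that later wakes see NULL. */
        *state->user_state_pointer = NULL;
        state->co->resumed++;   /* models aio_co_wake() */
    }
}

/* Timer callback: goes through the pointer-to-pointer at fire time. */
static void sleep_timer_cb(void *opaque)
{
    SleepState **state = opaque;

    sleep_wake(*state);
}
```

In this model, calling `sleep_wake()` twice through the user's pointer, or letting the timer callback run after an explicit wake, resumes the task only once.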
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
From: Paolo Bonzini <pbonzini@redhat.com>
2
2
3
We cannot have a negative queue count/size/index, so use an unsigned type.
3
This simplification is enabled by the previous patch. Now aio_co_wake
4
Rename 'nr_queues' as 'queue_count' to match the spec naming.
4
will only be called once, therefore we do not care about a spurious
5
firing of the timer after a qemu_co_sleep_wake.
5
6
6
Reviewed-by: Eric Auger <eric.auger@redhat.com>
7
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
7
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8
Tested-by: Eric Auger <eric.auger@redhat.com>
9
Message-id: 20210517100548.28806-5-pbonzini@redhat.com
9
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
10
Message-id: 20201029093306.1063879-10-philmd@redhat.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Tested-by: Eric Auger <eric.auger@redhat.com>
13
---
11
---
14
block/nvme.c | 38 ++++++++++++++++++--------------------
12
util/qemu-coroutine-sleep.c | 8 ++++----
15
block/trace-events | 10 +++++-----
13
1 file changed, 4 insertions(+), 4 deletions(-)
16
2 files changed, 23 insertions(+), 25 deletions(-)
17
14
18
diff --git a/block/nvme.c b/block/nvme.c
15
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
19
index XXXXXXX..XXXXXXX 100644
16
index XXXXXXX..XXXXXXX 100644
20
--- a/block/nvme.c
17
--- a/util/qemu-coroutine-sleep.c
21
+++ b/block/nvme.c
18
+++ b/util/qemu-coroutine-sleep.c
22
@@ -XXX,XX +XXX,XX @@ struct BDRVNVMeState {
19
@@ -XXX,XX +XXX,XX @@ static const char *qemu_co_sleep_ns__scheduled = "qemu_co_sleep_ns";
23
* [1..]: io queues.
20
24
*/
21
struct QemuCoSleepState {
25
NVMeQueuePair **queues;
22
Coroutine *co;
26
- int nr_queues;
23
- QEMUTimer ts;
27
+ unsigned queue_count;
24
QemuCoSleepState **user_state_pointer;
28
size_t page_size;
29
/* How many uint32_t elements does each doorbell entry take. */
30
size_t doorbell_scale;
31
@@ -XXX,XX +XXX,XX @@ static QemuOptsList runtime_opts = {
32
};
25
};
33
26
34
static void nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,
27
@@ -XXX,XX +XXX,XX @@ void qemu_co_sleep_wake(QemuCoSleepState *sleep_state)
35
- int nentries, int entry_bytes, Error **errp)
28
36
+ unsigned nentries, size_t entry_bytes, Error **errp)
29
assert(scheduled == qemu_co_sleep_ns__scheduled);
30
*sleep_state->user_state_pointer = NULL;
31
- timer_del(&sleep_state->ts);
32
aio_co_wake(sleep_state->co);
33
}
34
}
35
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
36
QemuCoSleepState **sleep_state)
37
{
37
{
38
size_t bytes;
38
AioContext *ctx = qemu_get_current_aio_context();
39
int r;
39
+ QEMUTimer ts;
40
@@ -XXX,XX +XXX,XX @@ static void nvme_free_req_queue_cb(void *opaque)
40
QemuCoSleepState state = {
41
41
.co = qemu_coroutine_self(),
42
static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
42
.user_state_pointer = sleep_state,
43
AioContext *aio_context,
43
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
44
- int idx, int size,
44
abort();
45
+ unsigned idx, size_t size,
46
Error **errp)
47
{
48
int i, r;
49
@@ -XXX,XX +XXX,XX @@ static bool nvme_poll_queues(BDRVNVMeState *s)
50
bool progress = false;
51
int i;
52
53
- for (i = 0; i < s->nr_queues; i++) {
54
+ for (i = 0; i < s->queue_count; i++) {
55
if (nvme_poll_queue(s->queues[i])) {
56
progress = true;
57
}
58
@@ -XXX,XX +XXX,XX @@ static void nvme_handle_event(EventNotifier *n)
59
static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
60
{
61
BDRVNVMeState *s = bs->opaque;
62
- int n = s->nr_queues;
63
+ unsigned n = s->queue_count;
64
NVMeQueuePair *q;
65
NvmeCmd cmd;
66
- int queue_size = NVME_QUEUE_SIZE;
67
+ unsigned queue_size = NVME_QUEUE_SIZE;
68
69
q = nvme_create_queue_pair(s, bdrv_get_aio_context(bs),
70
n, queue_size, errp);
71
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
72
.cdw11 = cpu_to_le32(0x3),
73
};
74
if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
75
- error_setg(errp, "Failed to create CQ io queue [%d]", n);
76
+ error_setg(errp, "Failed to create CQ io queue [%u]", n);
77
goto out_error;
78
}
45
}
79
cmd = (NvmeCmd) {
46
80
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
47
- aio_timer_init(ctx, &state.ts, type, SCALE_NS, co_sleep_cb, sleep_state);
81
.cdw11 = cpu_to_le32(0x1 | (n << 16)),
48
+ aio_timer_init(ctx, &ts, type, SCALE_NS, co_sleep_cb, sleep_state);
82
};
49
*sleep_state = &state;
83
if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
50
- timer_mod(&state.ts, qemu_clock_get_ns(type) + ns);
84
- error_setg(errp, "Failed to create SQ io queue [%d]", n);
51
+ timer_mod(&ts, qemu_clock_get_ns(type) + ns);
85
+ error_setg(errp, "Failed to create SQ io queue [%u]", n);
52
qemu_coroutine_yield();
86
goto out_error;
53
+ timer_del(&ts);
87
}
54
88
s->queues = g_renew(NVMeQueuePair *, s->queues, n + 1);
55
/* qemu_co_sleep_wake clears *sleep_state before resuming this coroutine. */
89
s->queues[n] = q;
56
assert(*sleep_state == NULL);
90
- s->nr_queues++;
91
+ s->queue_count++;
92
return true;
93
out_error:
94
nvme_free_queue_pair(q);
95
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
96
ret = -EINVAL;
97
goto out;
98
}
99
- s->nr_queues = 1;
100
+ s->queue_count = 1;
101
QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
102
regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << AQA_ACQS_SHIFT) |
103
(NVME_QUEUE_SIZE << AQA_ASQS_SHIFT));
104
@@ -XXX,XX +XXX,XX @@ static int nvme_enable_disable_write_cache(BlockDriverState *bs, bool enable,
105
106
static void nvme_close(BlockDriverState *bs)
107
{
108
- int i;
109
BDRVNVMeState *s = bs->opaque;
110
111
- for (i = 0; i < s->nr_queues; ++i) {
112
+ for (unsigned i = 0; i < s->queue_count; ++i) {
113
nvme_free_queue_pair(s->queues[i]);
114
}
115
g_free(s->queues);
116
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int nvme_co_prw_aligned(BlockDriverState *bs,
117
};
118
119
trace_nvme_prw_aligned(s, is_write, offset, bytes, flags, qiov->niov);
120
- assert(s->nr_queues > 1);
121
+ assert(s->queue_count > 1);
122
req = nvme_get_free_req(ioq);
123
assert(req);
124
125
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int nvme_co_flush(BlockDriverState *bs)
126
.ret = -EINPROGRESS,
127
};
128
129
- assert(s->nr_queues > 1);
130
+ assert(s->queue_count > 1);
131
req = nvme_get_free_req(ioq);
132
assert(req);
133
nvme_submit_command(ioq, req, &cmd, nvme_rw_cb, &data);
134
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int nvme_co_pwrite_zeroes(BlockDriverState *bs,
135
cmd.cdw12 = cpu_to_le32(cdw12);
136
137
trace_nvme_write_zeroes(s, offset, bytes, flags);
138
- assert(s->nr_queues > 1);
139
+ assert(s->queue_count > 1);
140
req = nvme_get_free_req(ioq);
141
assert(req);
142
143
@@ -XXX,XX +XXX,XX @@ static int coroutine_fn nvme_co_pdiscard(BlockDriverState *bs,
144
return -ENOTSUP;
145
}
146
147
- assert(s->nr_queues > 1);
148
+ assert(s->queue_count > 1);
149
150
buf = qemu_try_memalign(s->page_size, s->page_size);
151
if (!buf) {
152
@@ -XXX,XX +XXX,XX @@ static void nvme_detach_aio_context(BlockDriverState *bs)
153
{
154
BDRVNVMeState *s = bs->opaque;
155
156
- for (int i = 0; i < s->nr_queues; i++) {
157
+ for (unsigned i = 0; i < s->queue_count; i++) {
158
NVMeQueuePair *q = s->queues[i];
159
160
qemu_bh_delete(q->completion_bh);
161
@@ -XXX,XX +XXX,XX @@ static void nvme_attach_aio_context(BlockDriverState *bs,
162
aio_set_event_notifier(new_context, &s->irq_notifier[MSIX_SHARED_IRQ_IDX],
163
false, nvme_handle_event, nvme_poll_cb);
164
165
- for (int i = 0; i < s->nr_queues; i++) {
166
+ for (unsigned i = 0; i < s->queue_count; i++) {
167
NVMeQueuePair *q = s->queues[i];
168
169
q->completion_bh =
170
@@ -XXX,XX +XXX,XX @@ static void nvme_aio_plug(BlockDriverState *bs)
171
172
static void nvme_aio_unplug(BlockDriverState *bs)
173
{
174
- int i;
175
BDRVNVMeState *s = bs->opaque;
176
assert(s->plugged);
177
s->plugged = false;
178
- for (i = INDEX_IO(0); i < s->nr_queues; i++) {
179
+ for (unsigned i = INDEX_IO(0); i < s->queue_count; i++) {
180
NVMeQueuePair *q = s->queues[i];
181
qemu_mutex_lock(&q->lock);
182
nvme_kick(q);
183
diff --git a/block/trace-events b/block/trace-events
184
index XXXXXXX..XXXXXXX 100644
185
--- a/block/trace-events
186
+++ b/block/trace-events
187
@@ -XXX,XX +XXX,XX @@ qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) "s
188
# nvme.c
189
nvme_controller_capability_raw(uint64_t value) "0x%08"PRIx64
190
nvme_controller_capability(const char *desc, uint64_t value) "%s: %"PRIu64
191
-nvme_kick(void *s, int queue) "s %p queue %d"
192
+nvme_kick(void *s, unsigned q_index) "s %p q #%u"
193
nvme_dma_flush_queue_wait(void *s) "s %p"
194
nvme_error(int cmd_specific, int sq_head, int sqid, int cid, int status) "cmd_specific %d sq_head %d sqid %d cid %d status 0x%x"
195
-nvme_process_completion(void *s, int index, int inflight) "s %p queue %d inflight %d"
196
-nvme_process_completion_queue_plugged(void *s, int index) "s %p queue %d"
197
-nvme_complete_command(void *s, int index, int cid) "s %p queue %d cid %d"
198
-nvme_submit_command(void *s, int index, int cid) "s %p queue %d cid %d"
199
+nvme_process_completion(void *s, unsigned q_index, int inflight) "s %p q #%u inflight %d"
200
+nvme_process_completion_queue_plugged(void *s, unsigned q_index) "s %p q #%u"
201
+nvme_complete_command(void *s, unsigned q_index, int cid) "s %p q #%u cid %d"
202
+nvme_submit_command(void *s, unsigned q_index, int cid) "s %p q #%u cid %d"
203
nvme_submit_command_raw(int c0, int c1, int c2, int c3, int c4, int c5, int c6, int c7) "%02x %02x %02x %02x %02x %02x %02x %02x"
204
nvme_handle_event(void *s) "s %p"
205
nvme_poll_queue(void *s, unsigned q_index) "s %p q #%u"
206
--
57
--
207
2.28.0
58
2.31.1
208
59
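The NVMe side of the comparison above switches the queue counter to an unsigned `queue_count` and iterates with unsigned loop variables. The pattern can be sketched in standalone C as follows. Everything here is a simplified assumption for illustration: `Queue`, `State`, `add_queue()` and `free_queues()` are hypothetical names, and `malloc`/`realloc` stand in for the GLib allocation helpers the driver uses.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
typedef struct Queue {
    unsigned index;
} Queue;

typedef struct State {
    Queue **queues;
    unsigned queue_count;   /* a count can never be negative */
} State;

/* Append one queue; its index is the count before the append. */
static bool add_queue(State *s)
{
    unsigned n = s->queue_count;
    Queue *q = malloc(sizeof(*q));
    Queue **grown;

    if (!q) {
        return false;
    }
    grown = realloc(s->queues, (n + 1) * sizeof(*grown));
    if (!grown) {
        free(q);
        return false;
    }
    q->index = n;
    grown[n] = q;
    s->queues = grown;
    s->queue_count++;
    return true;
}

/* Release all queues, iterating with an unsigned index. */
static void free_queues(State *s)
{
    for (unsigned i = 0; i < s->queue_count; ++i) {
        free(s->queues[i]);
    }
    free(s->queues);
    s->queues = NULL;
    s->queue_count = 0;
}
```

Keeping the counter, the loop variable and the `%u` format specifiers all unsigned avoids mixed signed/unsigned comparisons in loops like the ones touched by the patch.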
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
From: Paolo Bonzini <pbonzini@redhat.com>
2
2
3
The QEMU_VFIO_DEBUG definition is only modifiable at build-time.
3
Right now, users of qemu_co_sleep_ns_wakeable are simply passing
4
Trace events can be enabled at run-time. As we prefer the latter,
4
a pointer to QemuCoSleepState by reference to the function. But
5
convert qemu_vfio_dump_mappings() to use trace events instead
5
QemuCoSleepState really is just a Coroutine*; making the
6
of fprintf().
6
content of the struct public is just as efficient and lets us
7
7
skip the user_state_pointer indirection.
8
Reviewed-by: Fam Zheng <fam@euphon.net>
8
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Since the usage is changed, take the occasion to rename the
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
10
struct to QemuCoSleep.
11
Message-id: 20201103020733.2303148-7-philmd@redhat.com
11
12
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
13
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
14
Message-id: 20210517100548.28806-6-pbonzini@redhat.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
15
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Tested-by: Eric Auger <eric.auger@redhat.com>
14
---
16
---
15
util/vfio-helpers.c | 19 ++++---------------
17
include/qemu/coroutine.h | 23 +++++++++++----------
16
util/trace-events | 1 +
18
block/block-copy.c | 8 ++++----
17
2 files changed, 5 insertions(+), 15 deletions(-)
19
block/nbd.c | 10 ++++-----
18
20
util/qemu-coroutine-sleep.c | 41 ++++++++++++++++---------------------
19
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
21
4 files changed, 39 insertions(+), 43 deletions(-)
20
index XXXXXXX..XXXXXXX 100644
22
21
--- a/util/vfio-helpers.c
23
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
22
+++ b/util/vfio-helpers.c
24
index XXXXXXX..XXXXXXX 100644
23
@@ -XXX,XX +XXX,XX @@ QEMUVFIOState *qemu_vfio_open_pci(const char *device, Error **errp)
25
--- a/include/qemu/coroutine.h
24
return s;
26
+++ b/include/qemu/coroutine.h
25
}
27
@@ -XXX,XX +XXX,XX @@ void qemu_co_rwlock_wrlock(CoRwlock *lock);
26
28
*/
27
-static void qemu_vfio_dump_mapping(IOVAMapping *m)
29
void qemu_co_rwlock_unlock(CoRwlock *lock);
30
31
-typedef struct QemuCoSleepState QemuCoSleepState;
32
+typedef struct QemuCoSleep {
33
+ Coroutine *to_wake;
34
+} QemuCoSleep;
35
36
/**
37
- * Yield the coroutine for a given duration. During this yield, @sleep_state
38
- * is set to an opaque pointer, which may be used for
39
- * qemu_co_sleep_wake(). Be careful, the pointer is set back to zero when the
40
- * timer fires. Don't save the obtained value to other variables and don't call
41
- * qemu_co_sleep_wake from another aio context.
42
+ * Yield the coroutine for a given duration. Initializes @w so that,
43
+ * during this yield, it can be passed to qemu_co_sleep_wake() to
44
+ * terminate the sleep.
45
*/
46
-void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
47
- QemuCoSleepState **sleep_state);
48
+void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
49
+ QEMUClockType type, int64_t ns);
50
+
51
static inline void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
52
{
53
- QemuCoSleepState *unused = NULL;
54
- qemu_co_sleep_ns_wakeable(type, ns, &unused);
55
+ QemuCoSleep w = { 0 };
56
+ qemu_co_sleep_ns_wakeable(&w, type, ns);
57
}
58
59
/**
60
@@ -XXX,XX +XXX,XX @@ static inline void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
61
* qemu_co_sleep_ns() and should be checked to be non-NULL before calling
62
* qemu_co_sleep_wake().
63
*/
64
-void qemu_co_sleep_wake(QemuCoSleepState *sleep_state);
65
+void qemu_co_sleep_wake(QemuCoSleep *w);
66
67
/**
68
* Yield until a file descriptor becomes readable
69
diff --git a/block/block-copy.c b/block/block-copy.c
70
index XXXXXXX..XXXXXXX 100644
71
--- a/block/block-copy.c
72
+++ b/block/block-copy.c
73
@@ -XXX,XX +XXX,XX @@ typedef struct BlockCopyCallState {
74
/* State */
75
int ret;
76
bool finished;
77
- QemuCoSleepState *sleep_state;
78
+ QemuCoSleep sleep;
79
bool cancelled;
80
81
/* OUT parameters */
82
@@ -XXX,XX +XXX,XX @@ block_copy_dirty_clusters(BlockCopyCallState *call_state)
83
if (ns > 0) {
84
block_copy_task_end(task, -EAGAIN);
85
g_free(task);
86
- qemu_co_sleep_ns_wakeable(QEMU_CLOCK_REALTIME, ns,
87
- &call_state->sleep_state);
88
+ qemu_co_sleep_ns_wakeable(&call_state->sleep,
89
+ QEMU_CLOCK_REALTIME, ns);
90
continue;
91
}
92
}
93
@@ -XXX,XX +XXX,XX @@ out:
94
95
void block_copy_kick(BlockCopyCallState *call_state)
96
{
97
- qemu_co_sleep_wake(call_state->sleep_state);
98
+ qemu_co_sleep_wake(&call_state->sleep);
99
}
100
101
/*
102
diff --git a/block/nbd.c b/block/nbd.c
103
index XXXXXXX..XXXXXXX 100644
104
--- a/block/nbd.c
105
+++ b/block/nbd.c
106
@@ -XXX,XX +XXX,XX @@ typedef struct BDRVNBDState {
107
CoQueue free_sema;
108
Coroutine *connection_co;
109
Coroutine *teardown_co;
110
- QemuCoSleepState *connection_co_sleep_ns_state;
111
+ QemuCoSleep reconnect_sleep;
112
bool drained;
113
bool wait_drained_end;
114
int in_flight;
115
@@ -XXX,XX +XXX,XX @@ static void coroutine_fn nbd_client_co_drain_begin(BlockDriverState *bs)
116
BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
117
118
s->drained = true;
119
- qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
120
+ qemu_co_sleep_wake(&s->reconnect_sleep);
121
122
nbd_co_establish_connection_cancel(bs, false);
123
124
@@ -XXX,XX +XXX,XX @@ static void nbd_teardown_connection(BlockDriverState *bs)
125
126
s->state = NBD_CLIENT_QUIT;
127
if (s->connection_co) {
128
- qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
129
+ qemu_co_sleep_wake(&s->reconnect_sleep);
130
nbd_co_establish_connection_cancel(bs, true);
131
}
132
if (qemu_in_coroutine()) {
133
@@ -XXX,XX +XXX,XX @@ static coroutine_fn void nbd_co_reconnect_loop(BDRVNBDState *s)
134
}
135
bdrv_inc_in_flight(s->bs);
136
} else {
137
- qemu_co_sleep_ns_wakeable(QEMU_CLOCK_REALTIME, timeout,
138
- &s->connection_co_sleep_ns_state);
139
+ qemu_co_sleep_ns_wakeable(&s->reconnect_sleep,
140
+ QEMU_CLOCK_REALTIME, timeout);
141
if (s->drained) {
142
continue;
143
}
144
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
145
index XXXXXXX..XXXXXXX 100644
146
--- a/util/qemu-coroutine-sleep.c
147
+++ b/util/qemu-coroutine-sleep.c
148
@@ -XXX,XX +XXX,XX @@
149
150
static const char *qemu_co_sleep_ns__scheduled = "qemu_co_sleep_ns";
151
152
-struct QemuCoSleepState {
153
+void qemu_co_sleep_wake(QemuCoSleep *w)
154
+{
155
Coroutine *co;
156
- QemuCoSleepState **user_state_pointer;
157
-};
158
159
-void qemu_co_sleep_wake(QemuCoSleepState *sleep_state)
28
-{
160
-{
29
- if (QEMU_VFIO_DEBUG) {
161
- if (sleep_state) {
30
- printf(" vfio mapping %p %" PRIx64 " to %" PRIx64 "\n", m->host,
162
+ co = w->to_wake;
31
- (uint64_t)m->size, (uint64_t)m->iova);
163
+ w->to_wake = NULL;
32
- }
164
+ if (co) {
33
-}
165
/* Write of schedule protected by barrier write in aio_co_schedule */
34
-
166
- const char *scheduled = qatomic_cmpxchg(&sleep_state->co->scheduled,
35
static void qemu_vfio_dump_mappings(QEMUVFIOState *s)
167
+ const char *scheduled = qatomic_cmpxchg(&co->scheduled,
36
{
168
qemu_co_sleep_ns__scheduled, NULL);
37
- int i;
169
38
-
170
assert(scheduled == qemu_co_sleep_ns__scheduled);
39
- if (QEMU_VFIO_DEBUG) {
171
- *sleep_state->user_state_pointer = NULL;
40
- printf("vfio mappings\n");
172
- aio_co_wake(sleep_state->co);
41
- for (i = 0; i < s->nr_mappings; ++i) {
173
+ aio_co_wake(co);
42
- qemu_vfio_dump_mapping(&s->mappings[i]);
43
- }
44
+ for (int i = 0; i < s->nr_mappings; ++i) {
45
+ trace_qemu_vfio_dump_mapping(s->mappings[i].host,
46
+ s->mappings[i].iova,
47
+ s->mappings[i].size);
48
}
174
}
49
}
175
}
50
176
51
diff --git a/util/trace-events b/util/trace-events
177
static void co_sleep_cb(void *opaque)
52
index XXXXXXX..XXXXXXX 100644
178
{
53
--- a/util/trace-events
179
- QemuCoSleepState **sleep_state = opaque;
54
+++ b/util/trace-events
180
- qemu_co_sleep_wake(*sleep_state);
55
@@ -XXX,XX +XXX,XX @@ qemu_mutex_unlock(void *mutex, const char *file, const int line) "released mutex
181
+ QemuCoSleep *w = opaque;
56
qemu_vfio_dma_reset_temporary(void *s) "s %p"
182
+ qemu_co_sleep_wake(w);
57
qemu_vfio_ram_block_added(void *s, void *p, size_t size) "s %p host %p size 0x%zx"
183
}
58
qemu_vfio_ram_block_removed(void *s, void *p, size_t size) "s %p host %p size 0x%zx"
184
59
+qemu_vfio_dump_mapping(void *host, uint64_t iova, size_t size) "vfio mapping %p to iova 0x%08" PRIx64 " size 0x%zx"
185
-void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
60
qemu_vfio_find_mapping(void *s, void *p) "s %p host %p"
186
- QemuCoSleepState **sleep_state)
61
qemu_vfio_new_mapping(void *s, void *host, size_t size, int index, uint64_t iova) "s %p host %p size 0x%zx index %d iova 0x%"PRIx64
187
+void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
62
qemu_vfio_do_mapping(void *s, void *host, uint64_t iova, size_t size) "s %p host %p <-> iova 0x%"PRIx64 " size 0x%zx"
188
+ QEMUClockType type, int64_t ns)
189
{
190
+ Coroutine *co = qemu_coroutine_self();
191
AioContext *ctx = qemu_get_current_aio_context();
192
QEMUTimer ts;
193
- QemuCoSleepState state = {
194
- .co = qemu_coroutine_self(),
195
- .user_state_pointer = sleep_state,
196
- };
197
198
- const char *scheduled = qatomic_cmpxchg(&state.co->scheduled, NULL,
199
- qemu_co_sleep_ns__scheduled);
200
+ const char *scheduled = qatomic_cmpxchg(&co->scheduled, NULL,
201
+ qemu_co_sleep_ns__scheduled);
202
if (scheduled) {
203
fprintf(stderr,
204
"%s: Co-routine was already scheduled in '%s'\n",
205
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QEMUClockType type, int64_t ns,
206
abort();
207
}
208
209
- aio_timer_init(ctx, &ts, type, SCALE_NS, co_sleep_cb, sleep_state);
210
- *sleep_state = &state;
211
+ w->to_wake = co;
212
+ aio_timer_init(ctx, &ts, type, SCALE_NS, co_sleep_cb, w);
213
timer_mod(&ts, qemu_clock_get_ns(type) + ns);
214
qemu_coroutine_yield();
215
timer_del(&ts);
216
217
- /* qemu_co_sleep_wake clears *sleep_state before resuming this coroutine. */
218
- assert(*sleep_state == NULL);
219
+ /* w->to_wake is cleared before resuming this coroutine. */
220
+ assert(w->to_wake == NULL);
221
}
63
--
222
--
64
2.28.0
223
2.31.1
65
224
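The QemuCoSleep patch above replaces the opaque state pointer with a small public struct holding the coroutine to wake. A toy model of that shape, assuming nothing from QEMU, might look like the sketch below. `Task`, `Sleep`, `sleep_arm()`, `sleep_wake()` and `sleep_finish()` are hypothetical names chosen for this example, and a counter increment stands in for `aio_co_wake()`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a coroutine: counts how often it was resumed. */
typedef struct Task {
    int resumed;
} Task;

/* Mirrors the shape of the public QemuCoSleep struct in the patch. */
typedef struct Sleep {
    Task *to_wake;
} Sleep;

/* Models the part of the sleep function that records the sleeper. */
static void sleep_arm(Sleep *w, Task *self)
{
    w->to_wake = self;
}

/* Take the pointer, clear it, then resume the task if one was set. */
static void sleep_wake(Sleep *w)
{
    Task *co = w->to_wake;

    w->to_wake = NULL;
    if (co) {
        co->resumed++;   /* models aio_co_wake() */
    }
}

/* Models the check made after the sleeper has been resumed. */
static void sleep_finish(Sleep *w)
{
    /* to_wake is cleared before the sleeper runs again. */
    assert(w->to_wake == NULL);
}
```

Because the struct is embedded in the caller's own state, as `BlockCopyCallState` and `BDRVNBDState` do in the patch, the caller passes its address to both the sleep and the wake functions, and waking an idle `Sleep` is a no-op.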
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
1
From: Paolo Bonzini <pbonzini@redhat.com>
2
2
3
Just for consistency, following the example documented since
3
Allow using QemuCoSleep to sleep forever until woken by qemu_co_sleep_wake.
4
commit e3fe3988d7 ("error: Document Error API usage rules"),
4
This makes the logic of qemu_co_sleep_ns_wakeable easy to understand.
5
return a boolean value indicating an error is set or not.
6
Directly pass errp as the local_err is not requested in our
7
case.
8
5
9
Tested-by: Eric Auger <eric.auger@redhat.com>
6
In the future we will introduce an API that can work even if the
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
7
sleep and wake happen from different threads. For now, initializing
11
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
w->to_wake after timer_mod is fine because the timer can only fire in
12
Message-id: 20201029093306.1063879-11-philmd@redhat.com
9
the same AioContext.
10
11
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
12
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
13
Message-id: 20210517100548.28806-7-pbonzini@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
15
---
16
block/nvme.c | 12 +++++++-----
16
include/qemu/coroutine.h | 5 +++++
17
1 file changed, 7 insertions(+), 5 deletions(-)
17
util/qemu-coroutine-sleep.c | 26 +++++++++++++++++++-------
18
2 files changed, 24 insertions(+), 7 deletions(-)
18
19
19
diff --git a/block/nvme.c b/block/nvme.c
20
diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
20
index XXXXXXX..XXXXXXX 100644
21
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
--- a/include/qemu/coroutine.h
22
+++ b/block/nvme.c
23
+++ b/include/qemu/coroutine.h
23
@@ -XXX,XX +XXX,XX @@ static int nvme_cmd_sync(BlockDriverState *bs, NVMeQueuePair *q,
24
@@ -XXX,XX +XXX,XX @@ typedef struct QemuCoSleep {
24
return ret;
25
void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
26
QEMUClockType type, int64_t ns);
27
28
+/**
29
+ * Yield the coroutine until the next call to qemu_co_sleep_wake.
30
+ */
31
+void coroutine_fn qemu_co_sleep(QemuCoSleep *w);
32
+
33
static inline void coroutine_fn qemu_co_sleep_ns(QEMUClockType type, int64_t ns)
34
{
35
QemuCoSleep w = { 0 };
36
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
37
index XXXXXXX..XXXXXXX 100644
38
--- a/util/qemu-coroutine-sleep.c
39
+++ b/util/qemu-coroutine-sleep.c
40
@@ -XXX,XX +XXX,XX @@ static void co_sleep_cb(void *opaque)
41
qemu_co_sleep_wake(w);
25
}
42
}
26
43
27
-static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
44
-void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
28
+/* Returns true on success, false on failure. */
45
- QEMUClockType type, int64_t ns)
29
+static bool nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
46
+void coroutine_fn qemu_co_sleep(QemuCoSleep *w)
30
{
47
{
31
BDRVNVMeState *s = bs->opaque;
48
Coroutine *co = qemu_coroutine_self();
32
+ bool ret = false;
49
- AioContext *ctx = qemu_get_current_aio_context();
33
union {
50
- QEMUTimer ts;
34
NvmeIdCtrl ctrl;
51
35
NvmeIdNs ns;
52
const char *scheduled = qatomic_cmpxchg(&co->scheduled, NULL,
36
@@ -XXX,XX +XXX,XX @@ static void nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
53
qemu_co_sleep_ns__scheduled);
37
goto out;
54
@@ -XXX,XX +XXX,XX @@ void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
38
}
55
}
39
56
40
+ ret = true;
57
w->to_wake = co;
41
s->blkshift = lbaf->ds;
58
- aio_timer_init(ctx, &ts, type, SCALE_NS, co_sleep_cb, w),
42
out:
59
- timer_mod(&ts, qemu_clock_get_ns(type) + ns);
43
qemu_vfio_dma_unmap(s->vfio, id);
60
qemu_coroutine_yield();
44
qemu_vfree(id);
61
- timer_del(&ts);
62
63
/* w->to_wake is cleared before resuming this coroutine. */
64
assert(w->to_wake == NULL);
65
}
45
+
66
+
46
+ return ret;
67
+void coroutine_fn qemu_co_sleep_ns_wakeable(QemuCoSleep *w,
47
}
68
+ QEMUClockType type, int64_t ns)
48
69
+{
49
static bool nvme_poll_queue(NVMeQueuePair *q)
70
+ AioContext *ctx = qemu_get_current_aio_context();
50
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
71
+ QEMUTimer ts;
51
uint64_t cap;
72
+
52
uint64_t timeout_ms;
73
+ aio_timer_init(ctx, &ts, type, SCALE_NS, co_sleep_cb, w);
53
uint64_t deadline, now;
74
+ timer_mod(&ts, qemu_clock_get_ns(type) + ns);
54
- Error *local_err = NULL;
75
+
55
volatile NvmeBar *regs = NULL;
76
+ /*
56
77
+ * The timer will fire in the current AioContext, so the callback
57
qemu_co_mutex_init(&s->dma_map_lock);
78
+ * must happen after qemu_co_sleep yields and there is no race
58
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
79
+ * between timer_mod and qemu_co_sleep.
59
&s->irq_notifier[MSIX_SHARED_IRQ_IDX],
80
+ */
60
false, nvme_handle_event, nvme_poll_cb);
81
+ qemu_co_sleep(w);
61
82
+ timer_del(&ts);
62
- nvme_identify(bs, namespace, &local_err);
83
+}
63
- if (local_err) {
64
- error_propagate(errp, local_err);
65
+ if (!nvme_identify(bs, namespace, errp)) {
66
ret = -EIO;
67
goto out;
68
}
69
--
84
--
70
2.28.0
85
2.31.1
71
86
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
Just for consistency, following the example documented since
4
commit e3fe3988d7 ("error: Document Error API usage rules"),
5
return a boolean value indicating an error is set or not.
6
Directly pass errp as the local_err is not requested in our
7
case. This simplifies nvme_create_queue_pair() a bit.
8
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201029093306.1063879-12-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
block/nvme.c | 16 +++++++---------
17
1 file changed, 7 insertions(+), 9 deletions(-)
18
19
diff --git a/block/nvme.c b/block/nvme.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
+++ b/block/nvme.c
23
@@ -XXX,XX +XXX,XX @@ static QemuOptsList runtime_opts = {
24
},
25
};
26
27
-static void nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,
28
+/* Returns true on success, false on failure. */
29
+static bool nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,
30
unsigned nentries, size_t entry_bytes, Error **errp)
31
{
32
size_t bytes;
33
@@ -XXX,XX +XXX,XX @@ static void nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,
34
q->queue = qemu_try_memalign(s->page_size, bytes);
35
if (!q->queue) {
36
error_setg(errp, "Cannot allocate queue");
37
- return;
38
+ return false;
39
}
40
memset(q->queue, 0, bytes);
41
r = qemu_vfio_dma_map(s->vfio, q->queue, bytes, false, &q->iova);
42
if (r) {
43
error_setg(errp, "Cannot map queue");
44
+ return false;
45
}
46
+ return true;
47
}
48
49
static void nvme_free_queue_pair(NVMeQueuePair *q)
50
@@ -XXX,XX +XXX,XX @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
51
Error **errp)
52
{
53
int i, r;
54
- Error *local_err = NULL;
55
NVMeQueuePair *q;
56
uint64_t prp_list_iova;
57
58
@@ -XXX,XX +XXX,XX @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
59
req->prp_list_iova = prp_list_iova + i * s->page_size;
60
}
61
62
- nvme_init_queue(s, &q->sq, size, NVME_SQ_ENTRY_BYTES, &local_err);
63
- if (local_err) {
64
- error_propagate(errp, local_err);
65
+ if (!nvme_init_queue(s, &q->sq, size, NVME_SQ_ENTRY_BYTES, errp)) {
66
goto fail;
67
}
68
q->sq.doorbell = &s->doorbells[idx * s->doorbell_scale].sq_tail;
69
70
- nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES, &local_err);
71
- if (local_err) {
72
- error_propagate(errp, local_err);
73
+ if (!nvme_init_queue(s, &q->cq, size, NVME_CQ_ENTRY_BYTES, errp)) {
74
goto fail;
75
}
76
q->cq.doorbell = &s->doorbells[idx * s->doorbell_scale].cq_head;
77
--
78
2.28.0
79
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
Rename Submission Queue flags with 'Sq' to differentiate
4
submission queue flags from command queue flags, and introduce
5
Completion Queue flag definitions.
6
7
Reviewed-by: Eric Auger <eric.auger@redhat.com>
8
Tested-by: Eric Auger <eric.auger@redhat.com>
9
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Message-id: 20201029093306.1063879-13-philmd@redhat.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Tested-by: Eric Auger <eric.auger@redhat.com>
14
---
15
include/block/nvme.h | 18 ++++++++++++------
16
1 file changed, 12 insertions(+), 6 deletions(-)
17
18
diff --git a/include/block/nvme.h b/include/block/nvme.h
19
index XXXXXXX..XXXXXXX 100644
20
--- a/include/block/nvme.h
21
+++ b/include/block/nvme.h
22
@@ -XXX,XX +XXX,XX @@ typedef struct QEMU_PACKED NvmeCreateCq {
23
#define NVME_CQ_FLAGS_PC(cq_flags) (cq_flags & 0x1)
24
#define NVME_CQ_FLAGS_IEN(cq_flags) ((cq_flags >> 1) & 0x1)
25
26
+enum NvmeFlagsCq {
27
+ NVME_CQ_PC = 1,
28
+ NVME_CQ_IEN = 2,
29
+};
30
+
31
typedef struct QEMU_PACKED NvmeCreateSq {
32
uint8_t opcode;
33
uint8_t flags;
34
@@ -XXX,XX +XXX,XX @@ typedef struct QEMU_PACKED NvmeCreateSq {
35
#define NVME_SQ_FLAGS_PC(sq_flags) (sq_flags & 0x1)
36
#define NVME_SQ_FLAGS_QPRIO(sq_flags) ((sq_flags >> 1) & 0x3)
37
38
-enum NvmeQueueFlags {
39
- NVME_Q_PC = 1,
40
- NVME_Q_PRIO_URGENT = 0,
41
- NVME_Q_PRIO_HIGH = 1,
42
- NVME_Q_PRIO_NORMAL = 2,
43
- NVME_Q_PRIO_LOW = 3,
44
+enum NvmeFlagsSq {
45
+ NVME_SQ_PC = 1,
46
+
47
+ NVME_SQ_PRIO_URGENT = 0,
48
+ NVME_SQ_PRIO_HIGH = 1,
49
+ NVME_SQ_PRIO_NORMAL = 2,
50
+ NVME_SQ_PRIO_LOW = 3,
51
};
52
53
typedef struct QEMU_PACKED NvmeIdentify {
54
--
55
2.28.0
56
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
Replace magic values with definitions, and simplify since the
4
number of queues will never reach 64K.
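A minimal sketch of the dword packing this commit message refers to, assuming the layout visible in the diff (queue size minus one in the upper 16 bits, queue ID in the lower 16 bits). The helper name ex_create_queue_cdw10 is hypothetical and is not part of block/nvme.c; the assertions stand in for the masking that the patch removes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in block/nvme.c): build CDW10 for a
 * Create I/O Queue command. The queue size is stored 0's based in
 * bits 31:16 and the queue identifier in bits 15:0.
 */
static uint32_t ex_create_queue_cdw10(uint32_t queue_size, uint32_t qid)
{
    /* Assert instead of masking: the ID must already fit in 16 bits. */
    assert(qid <= UINT16_MAX);
    assert(queue_size >= 1 && queue_size - 1 <= UINT16_MAX);
    return ((queue_size - 1) << 16) | qid;
}
```

For example, a 128-entry queue with ID 1 would be encoded as 0x007F0001 under these assumptions.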
5
6
Reviewed-by: Eric Auger <eric.auger@redhat.com>
7
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Tested-by: Eric Auger <eric.auger@redhat.com>
9
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
10
Message-id: 20201029093306.1063879-14-philmd@redhat.com
11
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
12
Tested-by: Eric Auger <eric.auger@redhat.com>
13
---
14
block/nvme.c | 9 +++++----
15
1 file changed, 5 insertions(+), 4 deletions(-)
16
17
diff --git a/block/nvme.c b/block/nvme.c
18
index XXXXXXX..XXXXXXX 100644
19
--- a/block/nvme.c
20
+++ b/block/nvme.c
21
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
22
NvmeCmd cmd;
23
unsigned queue_size = NVME_QUEUE_SIZE;
24
25
+ assert(n <= UINT16_MAX);
26
q = nvme_create_queue_pair(s, bdrv_get_aio_context(bs),
27
n, queue_size, errp);
28
if (!q) {
29
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
30
cmd = (NvmeCmd) {
31
.opcode = NVME_ADM_CMD_CREATE_CQ,
32
.dptr.prp1 = cpu_to_le64(q->cq.iova),
33
- .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
34
- .cdw11 = cpu_to_le32(0x3),
35
+ .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | n),
36
+ .cdw11 = cpu_to_le32(NVME_CQ_IEN | NVME_CQ_PC),
37
};
38
if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
39
error_setg(errp, "Failed to create CQ io queue [%u]", n);
40
@@ -XXX,XX +XXX,XX @@ static bool nvme_add_io_queue(BlockDriverState *bs, Error **errp)
41
cmd = (NvmeCmd) {
42
.opcode = NVME_ADM_CMD_CREATE_SQ,
43
.dptr.prp1 = cpu_to_le64(q->sq.iova),
44
- .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | (n & 0xFFFF)),
45
- .cdw11 = cpu_to_le32(0x1 | (n << 16)),
46
+ .cdw10 = cpu_to_le32(((queue_size - 1) << 16) | n),
47
+ .cdw11 = cpu_to_le32(NVME_SQ_PC | (n << 16)),
48
};
49
if (nvme_cmd_sync(bs, s->queues[INDEX_ADMIN], &cmd)) {
50
error_setg(errp, "Failed to create SQ io queue [%u]", n);
51
--
52
2.28.0
53
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
From the specification chapter 3.1.8 "AQA - Admin Queue Attributes"
4
the Admin Submission Queue Size field is a 0’s based value:
5
6
Admin Submission Queue Size (ASQS):
7
8
Defines the size of the Admin Submission Queue in entries.
9
Enabling a controller while this field is cleared to 00h
10
produces undefined results. The minimum size of the Admin
11
Submission Queue is two entries. The maximum size of the
12
Admin Submission Queue is 4096 entries.
13
This is a 0’s based value.
14
15
This bug has never been hit because the device initialization
16
uses a single command synchronously :)
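The 0's based encoding quoted above can be illustrated with a short sketch. It assumes the AQA layout from the NVMe specification (ASQS in bits 11:0, ACQS in bits 27:16); the EX_* macros and ex_aqa_encode are hypothetical names for this example, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions for this sketch. */
#define EX_AQA_ASQS_SHIFT 0
#define EX_AQA_ACQS_SHIFT 16
#define EX_AQA_FIELD_MASK 0xFFFu

/*
 * Hypothetical helper: take queue sizes in entries and return the AQA
 * register value, storing each size as a 0's based field (entries - 1).
 */
static uint32_t ex_aqa_encode(uint32_t sq_entries, uint32_t cq_entries)
{
    /* The quoted text allows 2 to 4096 entries. */
    assert(sq_entries >= 2 && sq_entries <= 4096);
    assert(cq_entries >= 2 && cq_entries <= 4096);
    return (((cq_entries - 1) & EX_AQA_FIELD_MASK) << EX_AQA_ACQS_SHIFT) |
           (((sq_entries - 1) & EX_AQA_FIELD_MASK) << EX_AQA_ASQS_SHIFT);
}
```

Writing the raw entry count instead of entries - 1 would advertise a queue one entry larger than the memory actually allocated for it.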
17
18
Reviewed-by: Eric Auger <eric.auger@redhat.com>
19
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
20
Tested-by: Eric Auger <eric.auger@redhat.com>
21
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
22
Message-id: 20201029093306.1063879-15-philmd@redhat.com
23
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
24
Tested-by: Eric Auger <eric.auger@redhat.com>
25
---
26
block/nvme.c | 6 +++---
27
1 file changed, 3 insertions(+), 3 deletions(-)
28
29
diff --git a/block/nvme.c b/block/nvme.c
30
index XXXXXXX..XXXXXXX 100644
31
--- a/block/nvme.c
32
+++ b/block/nvme.c
33
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
34
goto out;
35
}
36
s->queue_count = 1;
37
- QEMU_BUILD_BUG_ON(NVME_QUEUE_SIZE & 0xF000);
38
- regs->aqa = cpu_to_le32((NVME_QUEUE_SIZE << AQA_ACQS_SHIFT) |
39
- (NVME_QUEUE_SIZE << AQA_ASQS_SHIFT));
40
+ QEMU_BUILD_BUG_ON((NVME_QUEUE_SIZE - 1) & 0xF000);
41
+ regs->aqa = cpu_to_le32(((NVME_QUEUE_SIZE - 1) << AQA_ACQS_SHIFT) |
42
+ ((NVME_QUEUE_SIZE - 1) << AQA_ASQS_SHIFT));
43
regs->asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
44
regs->acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
45
46
--
47
2.28.0
48
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
We don't need to dereference from BDRVNVMeState each time.
4
Use a NVMeQueuePair pointer on the admin queue.
5
The nvme_init() becomes easier to review, matching the style
6
of nvme_add_io_queue().
7
8
Reviewed-by: Eric Auger <eric.auger@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201029093306.1063879-16-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
block/nvme.c | 12 ++++++------
17
1 file changed, 6 insertions(+), 6 deletions(-)
18
19
diff --git a/block/nvme.c b/block/nvme.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
+++ b/block/nvme.c
23
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
24
Error **errp)
25
{
26
BDRVNVMeState *s = bs->opaque;
27
+ NVMeQueuePair *q;
28
AioContext *aio_context = bdrv_get_aio_context(bs);
29
int ret;
30
uint64_t cap;
31
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
32
33
/* Set up admin queue. */
34
s->queues = g_new(NVMeQueuePair *, 1);
35
- s->queues[INDEX_ADMIN] = nvme_create_queue_pair(s, aio_context, 0,
36
- NVME_QUEUE_SIZE,
37
- errp);
38
- if (!s->queues[INDEX_ADMIN]) {
39
+ q = nvme_create_queue_pair(s, aio_context, 0, NVME_QUEUE_SIZE, errp);
40
+ if (!q) {
41
ret = -EINVAL;
42
goto out;
43
}
44
+ s->queues[INDEX_ADMIN] = q;
45
s->queue_count = 1;
46
QEMU_BUILD_BUG_ON((NVME_QUEUE_SIZE - 1) & 0xF000);
47
regs->aqa = cpu_to_le32(((NVME_QUEUE_SIZE - 1) << AQA_ACQS_SHIFT) |
48
((NVME_QUEUE_SIZE - 1) << AQA_ASQS_SHIFT));
49
- regs->asq = cpu_to_le64(s->queues[INDEX_ADMIN]->sq.iova);
50
- regs->acq = cpu_to_le64(s->queues[INDEX_ADMIN]->cq.iova);
51
+ regs->asq = cpu_to_le64(q->sq.iova);
52
+ regs->acq = cpu_to_le64(q->cq.iova);
53
54
/* After setting up all control registers we can enable device now. */
55
regs->cc = cpu_to_le32((ctz32(NVME_CQ_ENTRY_BYTES) << CC_IOCQES_SHIFT) |
56
--
57
2.28.0
58
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
Commit bdd6a90a9e5 ("block: Add VFIO based NVMe driver")
4
sets the request_alignment in nvme_refresh_limits().
5
For consistency, also set it during initialization.
6
7
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
8
Reviewed-by: Eric Auger <eric.auger@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201029093306.1063879-18-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
block/nvme.c | 1 +
17
1 file changed, 1 insertion(+)
18
19
diff --git a/block/nvme.c b/block/nvme.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
+++ b/block/nvme.c
23
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
24
s->page_size = MAX(4096, 1 << NVME_CAP_MPSMIN(cap));
25
s->doorbell_scale = (4 << NVME_CAP_DSTRD(cap)) / sizeof(uint32_t);
26
bs->bl.opt_mem_alignment = s->page_size;
27
+ bs->bl.request_alignment = s->page_size;
28
timeout_ms = MIN(500 * NVME_CAP_TO(cap), 30000);
29
30
/* Reset device to get a clean state. */
31
--
32
2.28.0
33
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
While trying to simplify the code using a macro, we forgot
4
the 12-bit shift... Correct that.
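To make the missing shift concrete, here is a small sketch. It assumes CAP.MPSMIN is the 4-bit field at bits 51:48 of the capabilities register and that the minimum page size is 2^(12 + MPSMIN) bytes, as the NVMe specification defines it. The ex_* helpers are hypothetical stand-ins, not the NVME_CAP_MPSMIN macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor: extract MPSMIN from CAP (assumed bits 51:48). */
static uint32_t ex_cap_mpsmin(uint64_t cap)
{
    return (uint32_t)((cap >> 48) & 0xF);
}

/* Minimum memory page size in bytes: 2^(12 + MPSMIN). */
static uint64_t ex_min_page_size(uint64_t cap)
{
    return UINT64_C(1) << (12 + ex_cap_mpsmin(cap));
}

/* The expression being replaced, for comparison: MAX(4096, 1 << MPSMIN). */
static uint64_t ex_old_page_size(uint64_t cap)
{
    uint64_t v = UINT64_C(1) << ex_cap_mpsmin(cap);
    return v > 4096 ? v : 4096;
}
```

With MPSMIN = 0 both expressions give 4096, which may be why the slip was easy to miss. With MPSMIN = 4 the corrected expression gives 65536 while the old one still gives 4096.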
5
6
Fixes: fad1eb68862 ("block/nvme: Use register definitions from 'block/nvme.h'")
7
Reported-by: Eric Auger <eric.auger@redhat.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Reviewed-by: Eric Auger <eric.auger@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201029093306.1063879-19-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
block/nvme.c | 2 +-
17
1 file changed, 1 insertion(+), 1 deletion(-)
18
19
diff --git a/block/nvme.c b/block/nvme.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
+++ b/block/nvme.c
23
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
24
goto out;
25
}
26
27
- s->page_size = MAX(4096, 1 << NVME_CAP_MPSMIN(cap));
28
+ s->page_size = 1u << (12 + NVME_CAP_MPSMIN(cap));
29
s->doorbell_scale = (4 << NVME_CAP_DSTRD(cap)) / sizeof(uint32_t);
30
bs->bl.opt_mem_alignment = s->page_size;
31
bs->bl.request_alignment = s->page_size;
32
--
33
2.28.0
34
Deleted patch
1
From: Eric Auger <eric.auger@redhat.com>
2
1
3
In preparation for 64kB host page support, let's change the size
4
and alignment of the IDENTIFY command response buffer so that
5
the VFIO DMA MAP succeeds. We align on the host page size.
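The size rounding described here can be sketched as follows. ex_align_up is a hypothetical stand-in for QEMU_ALIGN_UP, and the page sizes in the example are assumptions chosen for illustration (a 4096-byte response buffer on hosts with 4kB and 64kB pages).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for QEMU_ALIGN_UP: round n up to the next
 * multiple of align. align must be non-zero.
 */
static size_t ex_align_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}
```

On a 4kB-page host a 4096-byte buffer is unchanged, while on a 64kB-page host it grows to 65536 bytes, so both the allocation and the DMA mapping cover whole host pages.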
6
7
Signed-off-by: Eric Auger <eric.auger@redhat.com>
8
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201029093306.1063879-20-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
block/nvme.c | 9 +++++----
17
1 file changed, 5 insertions(+), 4 deletions(-)
18
19
diff --git a/block/nvme.c b/block/nvme.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
+++ b/block/nvme.c
23
@@ -XXX,XX +XXX,XX @@ static bool nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
24
.opcode = NVME_ADM_CMD_IDENTIFY,
25
.cdw10 = cpu_to_le32(0x1),
26
};
27
+ size_t id_size = QEMU_ALIGN_UP(sizeof(*id), qemu_real_host_page_size);
28
29
- id = qemu_try_memalign(s->page_size, sizeof(*id));
30
+ id = qemu_try_memalign(qemu_real_host_page_size, id_size);
31
if (!id) {
32
error_setg(errp, "Cannot allocate buffer for identify response");
33
goto out;
34
}
35
- r = qemu_vfio_dma_map(s->vfio, id, sizeof(*id), true, &iova);
36
+ r = qemu_vfio_dma_map(s->vfio, id, id_size, true, &iova);
37
if (r) {
38
error_setg(errp, "Cannot map buffer for DMA");
39
goto out;
40
}
41
42
- memset(id, 0, sizeof(*id));
43
+ memset(id, 0, id_size);
44
cmd.dptr.prp1 = cpu_to_le64(iova);
45
if (nvme_admin_cmd_sync(bs, &cmd)) {
46
error_setg(errp, "Failed to identify controller");
47
@@ -XXX,XX +XXX,XX @@ static bool nvme_identify(BlockDriverState *bs, int namespace, Error **errp)
48
s->supports_write_zeroes = !!(oncs & NVME_ONCS_WRITE_ZEROES);
49
s->supports_discard = !!(oncs & NVME_ONCS_DSM);
50
51
- memset(id, 0, sizeof(*id));
52
+ memset(id, 0, id_size);
53
cmd.cdw10 = 0;
54
cmd.nsid = cpu_to_le32(namespace);
55
if (nvme_admin_cmd_sync(bs, &cmd)) {
56
--
57
2.28.0
58
Deleted patch
1
From: Eric Auger <eric.auger@redhat.com>
2
1
3
In preparation for 64kB host page support, let's change the size
4
and alignment of the queue so that the VFIO DMA MAP succeeds.
5
We align on the host page size.
6
7
Signed-off-by: Eric Auger <eric.auger@redhat.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Tested-by: Eric Auger <eric.auger@redhat.com>
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Message-id: 20201029093306.1063879-21-philmd@redhat.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Tested-by: Eric Auger <eric.auger@redhat.com>
14
---
15
block/nvme.c | 4 ++--
16
1 file changed, 2 insertions(+), 2 deletions(-)
17
18
diff --git a/block/nvme.c b/block/nvme.c
19
index XXXXXXX..XXXXXXX 100644
20
--- a/block/nvme.c
21
+++ b/block/nvme.c
22
@@ -XXX,XX +XXX,XX @@ static bool nvme_init_queue(BDRVNVMeState *s, NVMeQueue *q,
23
size_t bytes;
24
int r;
25
26
- bytes = ROUND_UP(nentries * entry_bytes, s->page_size);
27
+ bytes = ROUND_UP(nentries * entry_bytes, qemu_real_host_page_size);
28
q->head = q->tail = 0;
29
- q->queue = qemu_try_memalign(s->page_size, bytes);
30
+ q->queue = qemu_try_memalign(qemu_real_host_page_size, bytes);
31
if (!q->queue) {
32
error_setg(errp, "Cannot allocate queue");
33
return false;
34
--
35
2.28.0
36
Deleted patch
1
From: Eric Auger <eric.auger@redhat.com>
2
1
3
In preparation for 64kB host page support, let's change the size
4
and alignment of the prp_list_pages so that the VFIO DMA MAP succeeds
5
with 64kB host page size. We align on the host page size.
6
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Signed-off-by: Eric Auger <eric.auger@redhat.com>
9
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201029093306.1063879-22-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
block/nvme.c | 11 ++++++-----
17
1 file changed, 6 insertions(+), 5 deletions(-)
18
19
diff --git a/block/nvme.c b/block/nvme.c
20
index XXXXXXX..XXXXXXX 100644
21
--- a/block/nvme.c
22
+++ b/block/nvme.c
23
@@ -XXX,XX +XXX,XX @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
24
int i, r;
25
NVMeQueuePair *q;
26
uint64_t prp_list_iova;
27
+ size_t bytes;
28
29
q = g_try_new0(NVMeQueuePair, 1);
30
if (!q) {
31
@@ -XXX,XX +XXX,XX @@ static NVMeQueuePair *nvme_create_queue_pair(BDRVNVMeState *s,
32
}
33
trace_nvme_create_queue_pair(idx, q, size, aio_context,
34
event_notifier_get_fd(s->irq_notifier));
35
- q->prp_list_pages = qemu_try_memalign(s->page_size,
36
- s->page_size * NVME_NUM_REQS);
37
+ bytes = QEMU_ALIGN_UP(s->page_size * NVME_NUM_REQS,
38
+ qemu_real_host_page_size);
39
+ q->prp_list_pages = qemu_try_memalign(qemu_real_host_page_size, bytes);
40
if (!q->prp_list_pages) {
41
goto fail;
42
}
43
- memset(q->prp_list_pages, 0, s->page_size * NVME_NUM_REQS);
44
+ memset(q->prp_list_pages, 0, bytes);
45
qemu_mutex_init(&q->lock);
46
q->s = s;
47
q->index = idx;
48
qemu_co_queue_init(&q->free_req_queue);
49
q->completion_bh = aio_bh_new(aio_context, nvme_process_completion_bh, q);
50
- r = qemu_vfio_dma_map(s->vfio, q->prp_list_pages,
51
- s->page_size * NVME_NUM_REQS,
52
+ r = qemu_vfio_dma_map(s->vfio, q->prp_list_pages, bytes,
53
false, &prp_list_iova);
54
if (r) {
55
goto fail;
56
--
57
2.28.0
58
Deleted patch
1
From: Eric Auger <eric.auger@redhat.com>
2
1
3
Make sure iov's va and size are properly aligned on the
4
host page size.
5
6
Signed-off-by: Eric Auger <eric.auger@redhat.com>
7
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
9
Tested-by: Eric Auger <eric.auger@redhat.com>
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Message-id: 20201029093306.1063879-23-philmd@redhat.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Tested-by: Eric Auger <eric.auger@redhat.com>
14
---
15
block/nvme.c | 14 ++++++++------
16
1 file changed, 8 insertions(+), 6 deletions(-)
17
18
diff --git a/block/nvme.c b/block/nvme.c
19
index XXXXXXX..XXXXXXX 100644
20
--- a/block/nvme.c
21
+++ b/block/nvme.c
22
@@ -XXX,XX +XXX,XX @@ static coroutine_fn int nvme_cmd_map_qiov(BlockDriverState *bs, NvmeCmd *cmd,
23
for (i = 0; i < qiov->niov; ++i) {
24
bool retry = true;
25
uint64_t iova;
26
+ size_t len = QEMU_ALIGN_UP(qiov->iov[i].iov_len,
27
+ qemu_real_host_page_size);
28
try_map:
29
r = qemu_vfio_dma_map(s->vfio,
30
qiov->iov[i].iov_base,
31
- qiov->iov[i].iov_len,
32
- true, &iova);
33
+ len, true, &iova);
34
if (r == -ENOMEM && retry) {
35
retry = false;
36
trace_nvme_dma_flush_queue_wait(s);
37
@@ -XXX,XX +XXX,XX @@ static inline bool nvme_qiov_aligned(BlockDriverState *bs,
38
BDRVNVMeState *s = bs->opaque;
39
40
for (i = 0; i < qiov->niov; ++i) {
41
- if (!QEMU_PTR_IS_ALIGNED(qiov->iov[i].iov_base, s->page_size) ||
42
- !QEMU_IS_ALIGNED(qiov->iov[i].iov_len, s->page_size)) {
43
+ if (!QEMU_PTR_IS_ALIGNED(qiov->iov[i].iov_base,
44
+ qemu_real_host_page_size) ||
45
+ !QEMU_IS_ALIGNED(qiov->iov[i].iov_len, qemu_real_host_page_size)) {
46
trace_nvme_qiov_unaligned(qiov, i, qiov->iov[i].iov_base,
47
qiov->iov[i].iov_len, s->page_size);
48
return false;
49
@@ -XXX,XX +XXX,XX @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
50
int r;
51
uint8_t *buf = NULL;
52
QEMUIOVector local_qiov;
53
-
54
+ size_t len = QEMU_ALIGN_UP(bytes, qemu_real_host_page_size);
55
assert(QEMU_IS_ALIGNED(offset, s->page_size));
56
assert(QEMU_IS_ALIGNED(bytes, s->page_size));
57
assert(bytes <= s->max_transfer);
58
@@ -XXX,XX +XXX,XX @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
59
}
60
s->stats.unaligned_accesses++;
61
trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
62
- buf = qemu_try_memalign(s->page_size, bytes);
63
+ buf = qemu_try_memalign(qemu_real_host_page_size, len);
64
65
if (!buf) {
66
return -ENOMEM;
67
--
68
2.28.0
69
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
qemu_vfio_pci_map_bar() calls mmap(), and mmap(2) states:
4
5
'offset' must be a multiple of the page size as returned
6
by sysconf(_SC_PAGE_SIZE).
7
8
In commit f68453237b9 we started to use an offset of 4K which
9
broke this contract on Aarch64 arch.
10
11
Fix by mapping at offset 0 and accessing the doorbells at offset=4K.
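The constraint and the workaround can be sketched in a few lines. This is an illustration only: ex_offset_ok_for_mmap and ex_doorbells are hypothetical helpers, and the 4096-byte register block size is an assumption matching the "offset=4K" wording above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* mmap() requires the file offset to be a multiple of the page size. */
static int ex_offset_ok_for_mmap(uint64_t offset, uint64_t page_size)
{
    return offset % page_size == 0;
}

/*
 * Hypothetical helper: given BAR0 mapped from offset 0, return a pointer
 * to the doorbell area that follows the register block. A failed mapping
 * (NULL) is passed through as NULL rather than being offset.
 */
static void *ex_doorbells(void *bar0, size_t regs_size)
{
    if (!bar0) {
        return NULL;
    }
    return (char *)bar0 + regs_size;
}
```

An offset of 4096 is acceptable with 4kB pages but not with 64kB pages, whereas offset 0 is always acceptable. The sketch checks the mapping before adding the offset, since adding an offset to a NULL mapping would yield a non-NULL pointer and hide the failure.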
12
13
Fixes: f68453237b9 ("block/nvme: Map doorbells pages write-only")
14
Reported-by: Eric Auger <eric.auger@redhat.com>
15
Reviewed-by: Eric Auger <eric.auger@redhat.com>
16
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
17
Tested-by: Eric Auger <eric.auger@redhat.com>
18
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
19
Message-id: 20201029093306.1063879-24-philmd@redhat.com
20
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
21
Tested-by: Eric Auger <eric.auger@redhat.com>
22
---
23
block/nvme.c | 11 +++++++----
24
1 file changed, 7 insertions(+), 4 deletions(-)
25
26
diff --git a/block/nvme.c b/block/nvme.c
27
index XXXXXXX..XXXXXXX 100644
28
--- a/block/nvme.c
29
+++ b/block/nvme.c
30
@@ -XXX,XX +XXX,XX @@ typedef struct {
31
struct BDRVNVMeState {
32
AioContext *aio_context;
33
QEMUVFIOState *vfio;
34
+ void *bar0_wo_map;
35
/* Memory mapped registers */
36
volatile struct {
37
uint32_t sq_tail;
38
@@ -XXX,XX +XXX,XX @@ static int nvme_init(BlockDriverState *bs, const char *device, int namespace,
39
}
40
}
41
42
- s->doorbells = qemu_vfio_pci_map_bar(s->vfio, 0, sizeof(NvmeBar),
43
- NVME_DOORBELL_SIZE, PROT_WRITE, errp);
44
+ s->bar0_wo_map = qemu_vfio_pci_map_bar(s->vfio, 0, 0,
45
+ sizeof(NvmeBar) + NVME_DOORBELL_SIZE,
46
+ PROT_WRITE, errp);
47
+ s->doorbells = (void *)((uintptr_t)s->bar0_wo_map + sizeof(NvmeBar));
48
if (!s->doorbells) {
49
ret = -EINVAL;
50
goto out;
51
@@ -XXX,XX +XXX,XX @@ static void nvme_close(BlockDriverState *bs)
52
&s->irq_notifier[MSIX_SHARED_IRQ_IDX],
53
false, NULL, NULL);
54
event_notifier_cleanup(&s->irq_notifier[MSIX_SHARED_IRQ_IDX]);
55
- qemu_vfio_pci_unmap_bar(s->vfio, 0, (void *)s->doorbells,
56
- sizeof(NvmeBar), NVME_DOORBELL_SIZE);
57
+ qemu_vfio_pci_unmap_bar(s->vfio, 0, s->bar0_wo_map,
58
+ 0, sizeof(NvmeBar) + NVME_DOORBELL_SIZE);
59
qemu_vfio_close(s->vfio);
60
61
g_free(s->device);
62
--
63
2.28.0
64
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
Replace the confusing "VFIO IOMMU check failed" error message with
4
the explicit "VFIO IOMMU Type1 is not supported" one.
5
6
Example on POWER:
7
8
$ qemu-system-ppc64 -drive if=none,id=nvme0,file=nvme://0001:01:00.0/1,format=raw
9
qemu-system-ppc64: -drive if=none,id=nvme0,file=nvme://0001:01:00.0/1,format=raw: VFIO IOMMU Type1 is not supported
10
11
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
12
Reviewed-by: Fam Zheng <fam@euphon.net>
13
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
15
Message-id: 20201103020733.2303148-2-philmd@redhat.com
16
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
17
Tested-by: Eric Auger <eric.auger@redhat.com>
18
---
19
util/vfio-helpers.c | 2 +-
20
1 file changed, 1 insertion(+), 1 deletion(-)
21
22
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
23
index XXXXXXX..XXXXXXX 100644
24
--- a/util/vfio-helpers.c
25
+++ b/util/vfio-helpers.c
26
@@ -XXX,XX +XXX,XX @@ static int qemu_vfio_init_pci(QEMUVFIOState *s, const char *device,
27
}
28
29
if (!ioctl(s->container, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
30
- error_setg_errno(errp, errno, "VFIO IOMMU check failed");
31
+ error_setg_errno(errp, errno, "VFIO IOMMU Type1 is not supported");
32
ret = -EINVAL;
33
goto fail_container;
34
}
35
--
36
2.28.0
37
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
We sometimes get a kernel panic with some devices on Aarch64
4
hosts. Alex Williamson suggests it might be a broken PCIe
5
root complex. Add trace events to record the latest I/O
6
access before crashing. Just in case, assert that our accesses are
7
aligned.
8
9
Reviewed-by: Fam Zheng <fam@euphon.net>
10
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
11
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
12
Message-id: 20201103020733.2303148-3-philmd@redhat.com
13
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
14
Tested-by: Eric Auger <eric.auger@redhat.com>
15
---
16
util/vfio-helpers.c | 8 ++++++++
17
util/trace-events | 2 ++
18
2 files changed, 10 insertions(+)
19
20
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
21
index XXXXXXX..XXXXXXX 100644
22
--- a/util/vfio-helpers.c
23
+++ b/util/vfio-helpers.c
24
@@ -XXX,XX +XXX,XX @@ static int qemu_vfio_pci_read_config(QEMUVFIOState *s, void *buf,
25
{
26
int ret;
27
28
+ trace_qemu_vfio_pci_read_config(buf, ofs, size,
29
+ s->config_region_info.offset,
30
+ s->config_region_info.size);
31
+ assert(QEMU_IS_ALIGNED(s->config_region_info.offset + ofs, size));
32
do {
33
ret = pread(s->device, buf, size, s->config_region_info.offset + ofs);
34
} while (ret == -1 && errno == EINTR);
35
@@ -XXX,XX +XXX,XX @@ static int qemu_vfio_pci_write_config(QEMUVFIOState *s, void *buf, int size, int
36
{
37
int ret;
38
39
+ trace_qemu_vfio_pci_write_config(buf, ofs, size,
40
+ s->config_region_info.offset,
41
+ s->config_region_info.size);
42
+ assert(QEMU_IS_ALIGNED(s->config_region_info.offset + ofs, size));
43
do {
44
ret = pwrite(s->device, buf, size, s->config_region_info.offset + ofs);
45
} while (ret == -1 && errno == EINTR);
46
diff --git a/util/trace-events b/util/trace-events
47
index XXXXXXX..XXXXXXX 100644
48
--- a/util/trace-events
49
+++ b/util/trace-events
50
@@ -XXX,XX +XXX,XX @@ qemu_vfio_new_mapping(void *s, void *host, size_t size, int index, uint64_t iova
51
qemu_vfio_do_mapping(void *s, void *host, size_t size, uint64_t iova) "s %p host %p size 0x%zx iova 0x%"PRIx64
52
qemu_vfio_dma_map(void *s, void *host, size_t size, bool temporary, uint64_t *iova) "s %p host %p size 0x%zx temporary %d iova %p"
53
qemu_vfio_dma_unmap(void *s, void *host) "s %p host %p"
54
+qemu_vfio_pci_read_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "read cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
55
+qemu_vfio_pci_write_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "write cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
56
--
57
2.28.0
58
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
For debugging purposes, trace BAR region info.
4
5
Reviewed-by: Fam Zheng <fam@euphon.net>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Message-id: 20201103020733.2303148-4-philmd@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
---
12
util/vfio-helpers.c | 8 ++++++++
13
util/trace-events | 1 +
14
2 files changed, 9 insertions(+)
15
16
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/util/vfio-helpers.c
19
+++ b/util/vfio-helpers.c
20
@@ -XXX,XX +XXX,XX @@ static inline void assert_bar_index_valid(QEMUVFIOState *s, int index)
21
22
static int qemu_vfio_pci_init_bar(QEMUVFIOState *s, int index, Error **errp)
23
{
24
+ g_autofree char *barname = NULL;
25
assert_bar_index_valid(s, index);
26
s->bar_region_info[index] = (struct vfio_region_info) {
27
.index = VFIO_PCI_BAR0_REGION_INDEX + index,
28
@@ -XXX,XX +XXX,XX @@ static int qemu_vfio_pci_init_bar(QEMUVFIOState *s, int index, Error **errp)
29
error_setg_errno(errp, errno, "Failed to get BAR region info");
30
return -errno;
31
}
32
+ barname = g_strdup_printf("bar[%d]", index);
33
+ trace_qemu_vfio_region_info(barname, s->bar_region_info[index].offset,
34
+ s->bar_region_info[index].size,
35
+ s->bar_region_info[index].cap_offset);
36
37
return 0;
38
}
39
@@ -XXX,XX +XXX,XX @@ static int qemu_vfio_init_pci(QEMUVFIOState *s, const char *device,
40
ret = -errno;
41
goto fail;
42
}
43
+ trace_qemu_vfio_region_info("config", s->config_region_info.offset,
44
+ s->config_region_info.size,
45
+ s->config_region_info.cap_offset);
46
47
for (i = 0; i < ARRAY_SIZE(s->bar_region_info); i++) {
48
ret = qemu_vfio_pci_init_bar(s, i, errp);
49
diff --git a/util/trace-events b/util/trace-events
50
index XXXXXXX..XXXXXXX 100644
51
--- a/util/trace-events
52
+++ b/util/trace-events
53
@@ -XXX,XX +XXX,XX @@ qemu_vfio_dma_map(void *s, void *host, size_t size, bool temporary, uint64_t *io
54
qemu_vfio_dma_unmap(void *s, void *host) "s %p host %p"
55
qemu_vfio_pci_read_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "read cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
56
qemu_vfio_pci_write_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "write cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
57
+qemu_vfio_region_info(const char *desc, uint64_t region_ofs, uint64_t region_size, uint32_t cap_offset) "region '%s' addr 0x%"PRIx64" size 0x%"PRIx64" cap_ofs 0x%"PRIx32
58
--
59
2.28.0
60
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
For debugging purposes, trace where a BAR is mapped.
4
5
Reviewed-by: Fam Zheng <fam@euphon.net>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Message-id: 20201103020733.2303148-5-philmd@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
---
12
util/vfio-helpers.c | 2 ++
13
util/trace-events | 1 +
14
2 files changed, 3 insertions(+)
15
16
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/util/vfio-helpers.c
19
+++ b/util/vfio-helpers.c
20
@@ -XXX,XX +XXX,XX @@ void *qemu_vfio_pci_map_bar(QEMUVFIOState *s, int index,
21
p = mmap(NULL, MIN(size, s->bar_region_info[index].size - offset),
22
prot, MAP_SHARED,
23
s->device, s->bar_region_info[index].offset + offset);
24
+ trace_qemu_vfio_pci_map_bar(index, s->bar_region_info[index].offset,
25
+ size, offset, p);
26
if (p == MAP_FAILED) {
27
error_setg_errno(errp, errno, "Failed to map BAR region");
28
p = NULL;
29
diff --git a/util/trace-events b/util/trace-events
30
index XXXXXXX..XXXXXXX 100644
31
--- a/util/trace-events
32
+++ b/util/trace-events
33
@@ -XXX,XX +XXX,XX @@ qemu_vfio_dma_unmap(void *s, void *host) "s %p host %p"
34
qemu_vfio_pci_read_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "read cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
35
qemu_vfio_pci_write_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "write cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
36
qemu_vfio_region_info(const char *desc, uint64_t region_ofs, uint64_t region_size, uint32_t cap_offset) "region '%s' addr 0x%"PRIx64" size 0x%"PRIx64" cap_ofs 0x%"PRIx32
37
+qemu_vfio_pci_map_bar(int index, uint64_t region_ofs, uint64_t region_size, int ofs, void *host) "map region bar#%d addr 0x%"PRIx64" size 0x%"PRIx64" ofs 0x%x host %p"
38
--
39
2.28.0
40
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
For debugging purposes, trace where DMA regions are mapped.
4
5
Reviewed-by: Fam Zheng <fam@euphon.net>
6
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
7
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
8
Message-id: 20201103020733.2303148-6-philmd@redhat.com
9
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
10
Tested-by: Eric Auger <eric.auger@redhat.com>
11
---
12
util/vfio-helpers.c | 3 ++-
13
util/trace-events | 5 +++--
14
2 files changed, 5 insertions(+), 3 deletions(-)
15
16
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
17
index XXXXXXX..XXXXXXX 100644
18
--- a/util/vfio-helpers.c
19
+++ b/util/vfio-helpers.c
20
@@ -XXX,XX +XXX,XX @@ static int qemu_vfio_do_mapping(QEMUVFIOState *s, void *host, size_t size,
21
.vaddr = (uintptr_t)host,
22
.size = size,
23
};
24
- trace_qemu_vfio_do_mapping(s, host, size, iova);
25
+ trace_qemu_vfio_do_mapping(s, host, iova, size);
26
27
if (ioctl(s->container, VFIO_IOMMU_MAP_DMA, &dma_map)) {
28
error_report("VFIO_MAP_DMA failed: %s", strerror(errno));
29
@@ -XXX,XX +XXX,XX @@ int qemu_vfio_dma_map(QEMUVFIOState *s, void *host, size_t size,
30
}
31
}
32
}
33
+ trace_qemu_vfio_dma_mapped(s, host, iova0, size);
34
if (iova) {
35
*iova = iova0;
36
}
37
diff --git a/util/trace-events b/util/trace-events
38
index XXXXXXX..XXXXXXX 100644
39
--- a/util/trace-events
40
+++ b/util/trace-events
41
@@ -XXX,XX +XXX,XX @@ qemu_vfio_ram_block_added(void *s, void *p, size_t size) "s %p host %p size 0x%z
42
qemu_vfio_ram_block_removed(void *s, void *p, size_t size) "s %p host %p size 0x%zx"
43
qemu_vfio_find_mapping(void *s, void *p) "s %p host %p"
44
qemu_vfio_new_mapping(void *s, void *host, size_t size, int index, uint64_t iova) "s %p host %p size 0x%zx index %d iova 0x%"PRIx64
45
-qemu_vfio_do_mapping(void *s, void *host, size_t size, uint64_t iova) "s %p host %p size 0x%zx iova 0x%"PRIx64
46
-qemu_vfio_dma_map(void *s, void *host, size_t size, bool temporary, uint64_t *iova) "s %p host %p size 0x%zx temporary %d iova %p"
47
+qemu_vfio_do_mapping(void *s, void *host, uint64_t iova, size_t size) "s %p host %p <-> iova 0x%"PRIx64 " size 0x%zx"
48
+qemu_vfio_dma_map(void *s, void *host, size_t size, bool temporary, uint64_t *iova) "s %p host %p size 0x%zx temporary %d &iova %p"
49
+qemu_vfio_dma_mapped(void *s, void *host, uint64_t iova, size_t size) "s %p host %p <-> iova 0x%"PRIx64" size 0x%zx"
50
qemu_vfio_dma_unmap(void *s, void *host) "s %p host %p"
51
qemu_vfio_pci_read_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "read cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
52
qemu_vfio_pci_write_config(void *buf, int ofs, int size, uint64_t region_ofs, uint64_t region_size) "write cfg ptr %p ofs 0x%x size 0x%x (region addr 0x%"PRIx64" size 0x%"PRIx64")"
53
--
54
2.28.0
55
Deleted patch
1
From: Philippe Mathieu-Daudé <philmd@redhat.com>
2
1
3
mmap(2) states:

  'offset' must be a multiple of the page size as returned
  by sysconf(_SC_PAGE_SIZE).

Add an assertion to be sure we don't break this contract.
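
The contract quoted above can be checked in isolation as follows.
This is a minimal sketch, assuming a POSIX host. The helper name
offset_is_page_aligned() is invented for this example, and the patch
itself uses QEMU_IS_ALIGNED() with qemu_real_host_page_size rather
than calling sysconf() directly:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: true if `offset` is a multiple of the host page
 * size reported by sysconf(_SC_PAGE_SIZE), as mmap(2) requires.
 */
static bool offset_is_page_aligned(uint64_t offset)
{
    long page_size = sysconf(_SC_PAGE_SIZE);

    /* sysconf() returns -1 on error; treat that as "not aligned". */
    return page_size > 0 && offset % (uint64_t)page_size == 0;
}
```

A caller would assert offset_is_page_aligned(offset) before passing
the offset to mmap(), so that a misaligned value fails loudly rather
than surfacing as an EINVAL from the kernel.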
9
10
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
11
Message-id: 20201103020733.2303148-8-philmd@redhat.com
12
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
13
Tested-by: Eric Auger <eric.auger@redhat.com>
14
---
15
util/vfio-helpers.c | 1 +
16
1 file changed, 1 insertion(+)
17
18
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
19
index XXXXXXX..XXXXXXX 100644
20
--- a/util/vfio-helpers.c
21
+++ b/util/vfio-helpers.c
22
@@ -XXX,XX +XXX,XX @@ void *qemu_vfio_pci_map_bar(QEMUVFIOState *s, int index,
23
Error **errp)
24
{
25
void *p;
26
+ assert(QEMU_IS_ALIGNED(offset, qemu_real_host_page_size));
27
assert_bar_index_valid(s, index);
28
p = mmap(NULL, MIN(size, s->bar_region_info[index].size - offset),
29
prot, MAP_SHARED,
30
--
31
2.28.0
32