[PULL 03/11] hw/dma/pl080: Fix transfer logic in PL080

Maintainers: Eric Auger <eric.auger@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, "Michael S. Tsirkin" <mst@redhat.com>, Igor Mammedov <imammedo@redhat.com>, Ani Sinha <anisinha@redhat.com>, Shannon Zhao <shannon.zhaosl@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Laurent Vivier <laurent@vivier.eu>, Pierrick Bouvier <pierrick.bouvier@linaro.org>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>
There is a newer version of this series
[PULL 03/11] hw/dma/pl080: Fix transfer logic in PL080
Posted by Peter Maydell 1 week, 2 days ago
From: Tao Ding <dingtao0430@163.com>

The logic in the PL080 for transferring data has multiple bugs:

 * The TransferSize field in the channel control register counts
   in units of the source width; because our loop may do multiple
   source loads if the destination width is greater than the
   source width, we need to decrement it by (xsize / swidth),
   not by 1, each loop

 * It is documented in the TRM that it is a software error to program
   the source and destination width such that SWidth < DWidth and
   TransferSize * SWidth is not a multiple of DWidth. (This would
   mean that there isn't enough data to do a full final destination
   write.) We weren't doing anything sensible with this case.
   The TRM doesn't document what the hardware actually does (though
   it drops some hints that suggest that it probably over-reads
   from the source).

 * In the loop to write to the destination, each loop adds swidth
   to  ch->dest for each loop and also uses (ch->dest + n) as the
   destination address. This moves the destination address on
   further than we should each time round the loop, and also
   is incrementing ch->dest by swidth when it should be dwidth.

This patch fixes these problems:
 * decrement TransferSize by the correct amount
 * log and ignore the transfer size mismatch case
 * correct the loop logic for the destination writes

A repro case which exercises some of this is as follows.  It
configures swidth to 1 byte, dwidth to 4 bytes, and transfer size 4,
for a transfer from 0x00000000 to 0x000010000.  Examining the
destination memory in the QEMU monitor should show that the
source data 0x44332211 has all been copied, but before this
fix it is not:

    ./qemu-system-arm -M versatilepb -m 128M -nographic -S \
    -device loader,addr=0x00000000,data=0x44332211,data-len=4 \
    -device loader,addr=0x00001000,data=0x00000000,data-len=4 \
    -device loader,addr=0x10130030,data=0x00000001,data-len=4 \
    -device loader,addr=0x10130100,data=0x00000000,data-len=4 \
    -device loader,addr=0x10130104,data=0x00001000,data-len=4 \
    -device loader,addr=0x10130108,data=0x00000000,data-len=4 \
    -device loader,addr=0x1013010C,data=0x9e47f004,data-len=4 \
    -device loader,addr=0x10130110,data=0x0000c001,data-len=4

Without this patch the QEMU monitor shows:
    (qemu) xp /1wx 0x00001000
    00001000: 0x00002211

Correct result:
    (qemu) xp /1wx 0x00001000
    00001000: 0x44332211

Cc: qemu-stable@nongnu.org
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Tao Ding <dingtao0430@163.com>
[PMM: Wrote up what we are fixing in the commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/dma/pl080.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/dma/pl080.c b/hw/dma/pl080.c
index 627ccbbd81..4a90c7bb27 100644
--- a/hw/dma/pl080.c
+++ b/hw/dma/pl080.c
@@ -179,23 +179,28 @@ again:
                               c, extract32(ch->ctrl, 21, 3));
                 continue;
             }
-
-            for (n = 0; n < dwidth; n+= swidth) {
+            if ((size * swidth) % dwidth) {
+                qemu_log_mask(LOG_GUEST_ERROR,
+                    "pl080: channel %d: transfer size mismatch: size=%d swidth=%d dwidth=%d\n",
+                    c, size, swidth, dwidth);
+                continue;
+            }
+            xsize = MAX(swidth, dwidth);
+            for (n = 0; n < xsize; n += swidth) {
                 address_space_read(&s->downstream_as, ch->src,
                                    MEMTXATTRS_UNSPECIFIED, buff + n, swidth);
                 if (ch->ctrl & PL080_CCTRL_SI)
                     ch->src += swidth;
             }
-            xsize = (dwidth < swidth) ? swidth : dwidth;
             /* ??? This may pad the value incorrectly for dwidth < 32.  */
             for (n = 0; n < xsize; n += dwidth) {
-                address_space_write(&s->downstream_as, ch->dest + n,
+                address_space_write(&s->downstream_as, ch->dest,
                                     MEMTXATTRS_UNSPECIFIED, buff + n, dwidth);
                 if (ch->ctrl & PL080_CCTRL_DI)
-                    ch->dest += swidth;
+                    ch->dest += dwidth;
             }
 
-            size--;
+            size -= xsize / swidth;
             ch->ctrl = (ch->ctrl & 0xfffff000) | size;
             if (size == 0) {
                 /* Transfer complete.  */
-- 
2.43.0