Upon examining the current implementation for getting/setting SIMD
and SVE registers via remote GDB, there is a concern about mixed
endian support. This patch series aims to address this concern and
allow getting and setting the values of NEON and SVE registers via
remote GDB regardless of the target endianness.
Consider the following snippet from a GDB session in which a SIMD
register's value is set via remote GDB where the QEMU host is little
endian and the target is big endian:
(gdb) p/x $v0
$1 = {d = {f = {0x0, 0x0}, u = {0x0, 0x0}, s = {0x0, 0x0}}, s = {f =
{0x0, 0x0, 0x0, 0x0}, u = {0x0, 0x0, 0x0, 0x0}, s = {0x0, 0x0,
0x0, 0x0}}, h = {bf = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0}, f = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, u = {
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, s = {0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0}}, b = {u = {0x0 <repeats 16 times>},
s = {0x0 <repeats 16 times>}}, q = {u = {0x0}, s = {0x0}}}
(gdb) set $v0.d.u[0] = 0x010203
(gdb) p/x $v0
$2 = {d = {f = {0x302010000000000, 0x0}, u = {0x302010000000000, 0x0},
s = {0x302010000000000, 0x0}}, s = {f = {0x3020100, 0x0, 0x0,
0x0}, u = {0x3020100, 0x0, 0x0, 0x0}, s = {0x3020100, 0x0, 0x0,
0x0}}, h = {bf = {0x302, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
f = {0x302, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, u = {0x302,
0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, s = {0x302, 0x100, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0}}, b = {u = {0x3, 0x2, 0x1, 0x0
<repeats 13 times>}, s = {0x3, 0x2, 0x1, 0x0 <repeats 13 times>}},
q = {u = {0x3020100000000000000000000000000}, s = {
0x3020100000000000000000000000000}}}
The above snippet exemplifies an issue with how the SIMD register value
is set when the target endianness differs from the host endianness. A
similar issue is evident when setting SVE registers, as is shown by the
snippet below where the QEMU host is little endian and the target is big
endian:
(gdb) p/x $z0
$1 = {q = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}},
d = {f = {0x0 <repeats 32 times>}, u = {0x0 <repeats 32 times>},
s = {0x0 <repeats 32 times>}}, s = {f = {0x0 <repeats 64 times>},
u = {0x0 <repeats 64 times>}, s = {0x0 <repeats 64 times>}},
h = {f = {0x0 <repeats 128 times>}, u = {0x0 <repeats 128 times>},
s = {0x0 <repeats 128 times>}}, b = {u = {0x0 <repeats 256 times>},
s = {0x0 <repeats 256 times>}}}
(gdb) set $z0.q.u[0] = 0x010203
(gdb) p/x $z0
$2 = {q = {u = {0x302010000000000, 0x0 <repeats 15 times>}, s =
{0x302010000000000, 0x0 <repeats 15 times>}}, d = {f = {0x0,
0x302010000000000, 0x0 <repeats 30 times>}, u = {0x0, 0x302010000000000,
0x0 <repeats 30 times>}, s = {0x0, 0x302010000000000,
0x0 <repeats 30 times>}}, s = {f = {0x0, 0x0, 0x3020100, 0x0
<repeats 61 times>}, u = {0x0, 0x0, 0x3020100, 0x0 <repeats 61 times>},
s = {0x0, 0x0, 0x3020100, 0x0 <repeats 61 times>}}, h = {f = {0x0, 0x0,
0x0, 0x0, 0x302, 0x100, 0x0 <repeats 122 times>}, u = {0x0, 0x0, 0x0,
0x0, 0x302, 0x100, 0x0 <repeats 122 times>}, s = {0x0, 0x0, 0x0, 0x0,
0x302, 0x100, 0x0 <repeats 122 times>}}, b = {u = {0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1, 0x0 <repeats 245 times>}, s = {0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1, 0x0 <repeats 245 times>
}}}
Note, in the case of SVE, this issue is also present when the host
and target are both little endian. Consider the GDB remote session snippet below
showcasing this:
(gdb) p/x $z0
$6 = {q = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}}, d =
{f = {0x0 <repeats 32 times>}, u = {0x0 <repeats 32 times>}, s = {
0x0 <repeats 32 times>}}, s = {f = {0x0 <repeats 64 times>}, u = {
0x0 <repeats 64 times>}, s = {0x0 <repeats 64 times>}}, h = {f = {
0x0 <repeats 128 times>}, u = {0x0 <repeats 128 times>}, s = {
0x0 <repeats 128 times>}}, b = {u = {0x0 <repeats 256 times>}, s = {
0x0 <repeats 256 times>}}}
(gdb) set $z0.q.u[0] = 0x010203
(gdb) p/x $z0
$7 = {q = {u = {0x102030000000000000000, 0x0 <repeats 15 times>}, s = {
0x102030000000000000000, 0x0 <repeats 15 times>}}, d = {f = {0x0,
0x10203,
0x0 <repeats 30 times>}, u = {0x0, 0x10203, 0x0 <repeats 30 times>},
s = {0x0,
0x10203, 0x0 <repeats 30 times>}}, s = {f = {0x0, 0x0, 0x10203,
0x0 <repeats 61 times>}, u = {0x0, 0x0, 0x10203, 0x0
<repeats 61 times>}, s = {0x0, 0x0, 0x10203, 0x0 <repeats 61 times>}},
h = {f = {0x0, 0x0, 0x0, 0x0, 0x203, 0x1, 0x0 <repeats 122 times>},
u = {0x0, 0x0, 0x0, 0x0, 0x203, 0x1, 0x0 <repeats 122 times>},
s = {0x0, 0x0, 0x0, 0x0, 0x203, 0x1, 0x0 <repeats 122 times>}},
b = {u = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1,
0x0 <repeats 245 times>}, s = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x3, 0x2, 0x1, 0x0 <repeats 245 times>}}}
In all scenarios, the value read back from the register after setting it
to 0x010203 is not preserved in the appropriate byte order, so GDB does not
print 0x010203 as expected.
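As a rough illustration of the arithmetic only (not the exact path taken
through the gdbstub), the byte-reversed value 0x0302010000000000 seen in the
first snippet is what results when the eight bytes of a big endian 0x010203
are read back as a little endian quantity. The helper names below are
hypothetical and exist only for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write v into buf[0..7], most significant byte first. */
static void store_be64(uint8_t *buf, uint64_t v)
{
    assert(buf != NULL);
    for (int i = 7; i >= 0; i--) {
        buf[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: read buf[0..7] as a big endian 64-bit value. */
static uint64_t load_be64(const uint8_t *buf)
{
    assert(buf != NULL);
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | buf[i];
    }
    return v;
}

/* Hypothetical helper: read buf[0..7] as a little endian 64-bit value. */
static uint64_t load_le64(const uint8_t *buf)
{
    assert(buf != NULL);
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | buf[i];
    }
    return v;
}
```

With a buffer filled by store_be64(buf, 0x010203), load_be64(buf) gives back
0x010203 while load_le64(buf) gives 0x0302010000000000, the same digits that
appear in d.u[0] above.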
The current gdbstub implementation for getting and setting SIMD registers
is as follows:
aarch64_gdb_set_fpu_reg:
    <omitted code>
    uint64_t *q = aa64_vfp_qreg(env, reg);
    q[0] = ldq_le_p(buf);
    q[1] = ldq_le_p(buf + 8);
    return 16;
    <omitted code>
The following code is a suggested fix for the current implementation that
should allow for mixed endian support for getting/setting SIMD registers
via the remote GDB protocol.
aarch64_gdb_set_fpu_reg:
    <omitted code>
    // case 0...31
    uint64_t *q = aa64_vfp_qreg(env, reg);
    if (target_big_endian()) {
        q[1] = ldq_p(buf);
        q[0] = ldq_p(buf + 8);
    } else {
        q[0] = ldq_p(buf);
        q[1] = ldq_p(buf + 8);
    }
    return 16;
    <omitted code>
Using ldq_p rather than ldq_le_p (which the current implementation
uses) to load bytes into a host endian struct is inspired by the current
implementation for getting/setting general purpose registers via remote
GDB (which works appropriately regardless of target endianness), as well
as the current implementation for getting/setting gprs via GDB with ppc
as a target (refer to ppc_cpu_gdb_write_register() for example). Note that
the order of setting q[0] and q[1] is suggested to be swapped for big
endian targets to ensure that q[1] always holds the most significant half
and q[0] always holds the least significant half (refer to the comment
in target/arm/cpu.h, line 155).
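The effect of the suggested change can be sketched outside of QEMU with a
small standalone model. Here load_target64() is a hypothetical stand-in for
ldq_p() that takes the target endianness as a parameter, and set_qreg() is a
hypothetical stand-in for the relevant part of aarch64_gdb_set_fpu_reg();
neither is the QEMU code itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ldq_p(): load 8 bytes in the target byte order. */
static uint64_t load_target64(const uint8_t *buf, bool target_big_endian)
{
    assert(buf != NULL);
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        /* Visit bytes from most significant to least significant. */
        int idx = target_big_endian ? i : 7 - i;
        v = (v << 8) | buf[idx];
    }
    return v;
}

/*
 * Hypothetical model of the suggested set path: q[0] receives the least
 * significant half and q[1] the most significant half, whichever half
 * comes first in the buffer. Returns the number of bytes consumed.
 */
static int set_qreg(uint64_t q[2], const uint8_t *buf, bool target_big_endian)
{
    if (target_big_endian) {
        q[1] = load_target64(buf, true);
        q[0] = load_target64(buf + 8, true);
    } else {
        q[0] = load_target64(buf, false);
        q[1] = load_target64(buf + 8, false);
    }
    return 16;
}
```

For a 16-byte buffer holding 0x00, 0x01, ..., 0x0f, the big endian case yields
q[1] = 0x0001020304050607 and q[0] = 0x08090a0b0c0d0e0f, while the little
endian case yields q[0] = 0x0706050403020100 and q[1] = 0x0f0e0d0c0b0a0908.
In both cases q[1] holds the most significant half of the 128-bit value as
the target would interpret it.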
For SVE, the current implementation is as follows for the zregs:
aarch64_gdb_set_sve_reg:
    <omitted code>
    // case 0...31
    int vq, len = 0;
    uint64_t *p = (uint64_t *) buf;
    for (vq = 0; vq < cpu->sve_max_vq; vq++) {
        env->vfp.zregs[reg].d[vq * 2 + 1] = *p++;
        env->vfp.zregs[reg].d[vq * 2] = *p++;
        len += 16;
    }
    return len;
The suggestion here is similar to the one above for SIMD: ldq_p should be
used rather than simple pointer dereferencing, so that the QEMU gdbstub
gets and sets register values correctly regardless of the target
endianness. The aim is to yield results such as the following from a
remote GDB session, regardless of target endianness:
(gdb) p/x $z0
$1 = {q = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}},
d = {f = {0x0 <repeats 32 times>}, u = {0x0 <repeats 32 times>},
s = {0x0 <repeats 32 times>}}, s = {f = {0x0 <repeats 64 times>},
u = {0x0 <repeats 64 times>}, s = {0x0 <repeats 64 times>}}, h = {f = {
0x0 <repeats 128 times>}, u = {0x0 <repeats 128 times>}, s = {
0x0 <repeats 128 times>}}, b = {u = {0x0 <repeats 256 times>}, s = {
0x0 <repeats 256 times>}}}
(gdb) set $z0.q.u[0] = 0x010203
(gdb) p/x $z0
$2 = {q = {u = {0x10203, 0x0 <repeats 15 times>}, s = {0x10203,
0x0 <repeats 15 times>}}, d = {f = {0x10203, 0x0
<repeats 31 times>},u = {0x10203, 0x0 <repeats 31 times>},
s = {0x10203, 0x0 <repeats 31 times>}}, s = {f = {0x10203,
0x0 <repeats 63 times>}, u = {0x10203, 0x0 <repeats 63 times>},
s = {0x10203, 0x0 <repeats 63 times>}}, h = {f = {0x203, 0x1,
0x0 <repeats 126 times>}, u = {0x203, 0x1, 0x0 <repeats 126 times>},
s = {0x203, 0x1, 0x0 <repeats 126 times>}}, b = {u = {0x3, 0x2, 0x1,
0x0 <repeats 253 times>}, s = {0x3, 0x2, 0x1, 0x0 <repeats 253 times>}}}
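One possible shape for the SVE suggestion, written as a standalone model
rather than as QEMU code, is sketched below. It assumes the same per-quadword
ordering as the SIMD suggestion (low half in d[vq * 2], high half in
d[vq * 2 + 1]); load_target64() and set_zreg() are hypothetical names, with
load_target64() standing in for ldq_p():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ldq_p(): load 8 bytes in the target byte order. */
static uint64_t load_target64(const uint8_t *buf, bool target_big_endian)
{
    assert(buf != NULL);
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        /* Visit bytes from most significant to least significant. */
        int idx = target_big_endian ? i : 7 - i;
        v = (v << 8) | buf[idx];
    }
    return v;
}

/*
 * Hypothetical model of a zreg set path: d points at 2 * max_vq 64-bit
 * elements. For each 128-bit quadword, d[vq * 2] receives the least
 * significant half and d[vq * 2 + 1] the most significant half.
 * Returns the number of bytes consumed from buf.
 */
static int set_zreg(uint64_t *d, const uint8_t *buf, int max_vq,
                    bool target_big_endian)
{
    int len = 0;

    assert(d != NULL);
    for (int vq = 0; vq < max_vq; vq++) {
        if (target_big_endian) {
            d[vq * 2 + 1] = load_target64(buf, true);
            d[vq * 2] = load_target64(buf + 8, true);
        } else {
            d[vq * 2] = load_target64(buf, false);
            d[vq * 2 + 1] = load_target64(buf + 8, false);
        }
        buf += 16;
        len += 16;
    }
    return len;
}
```

In this model, a buffer whose first quadword encodes 0x010203 in the target
byte order leaves d[0] = 0x010203 and d[1] = 0 for both endiannesses.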
The first patch will implement this change for NEON registers,
and the second patch will do so for SVE registers.
Glenn Miles (12):
ppc/xive2: Fix calculation of END queue sizes
ppc/xive2: Use fair irq target search algorithm
ppc/xive2: Fix irq preempted by lower priority group irq
ppc/xive2: Fix treatment of PIPR in CPPR update
pnv/xive2: Support ESB Escalation
ppc/xive2: add interrupt priority configuration flags
ppc/xive2: Support redistribution of group interrupts
ppc/xive: Add more interrupt notification tracing
ppc/xive2: Improve pool regs variable name
ppc/xive2: Implement "Ack OS IRQ to even report line" TIMA op
ppc/xive2: Redistribute group interrupt precluded by CPPR update
ppc/xive2: redistribute irqs for pool and phys ctx pull
Michael Kowal (4):
ppc/xive2: Remote VSDs need to match on forwarding address
ppc/xive2: Reset Generation Flipped bit on END Cache Watch
pnv/xive2: Print value in invalid register write logging
pnv/xive2: Permit valid writes to VC/PC Flush Control registers
Nicholas Piggin (34):
ppc/xive: Fix xive trace event output
ppc/xive: Report access size in XIVE TM operation error logs
ppc/xive2: fix context push calculation of IPB priority
ppc/xive: Fix PHYS NSR ring matching
ppc/xive2: Do not present group interrupt on OS-push if precluded by
CPPR
ppc/xive2: Set CPPR delivery should account for group priority
ppc/xive: tctx_notify should clear the precluded interrupt
ppc/xive: Explicitly zero NSR after accepting
ppc/xive: Move NSR decoding into helper functions
ppc/xive: Fix pulling pool and phys contexts
pnv/xive2: VC_ENDC_WATCH_SPEC regs should read back WATCH_FULL
ppc/xive: Change presenter .match_nvt to match not present
ppc/xive2: Redistribute group interrupt preempted by higher priority
interrupt
ppc/xive: Add xive_tctx_pipr_present() to present new interrupt
ppc/xive: Fix high prio group interrupt being preempted by low prio VP
ppc/xive: Split xive recompute from IPB function
ppc/xive: tctx signaling registers rework
ppc/xive: tctx_accept only lower irq line if an interrupt was
presented
ppc/xive: Add xive_tctx_pipr_set() helper function
ppc/xive2: split tctx presentation processing from set CPPR
ppc/xive2: Consolidate presentation processing in context push
ppc/xive2: Avoid needless interrupt re-check on CPPR set
ppc/xive: Assert group interrupts were redistributed
ppc/xive2: implement NVP context save restore for POOL ring
ppc/xive2: Prevent pulling of pool context losing phys interrupt
ppc/xive: Redistribute phys after pulling of pool context
ppc/xive: Check TIMA operations validity
ppc/xive2: Implement pool context push TIMA op
ppc/xive2: redistribute group interrupts on context push
ppc/xive2: Implement set_os_pending TIMA op
ppc/xive2: Implement POOL LGS push TIMA op
ppc/xive2: Implement PHYS ring VP push TIMA op
ppc/xive: Split need_resend into restore_nvp
ppc/xive2: Enable lower level contexts on VP push
Vacha Bhavsar (2):
This patch adds big endian support for NEON GDB remote debugging. It
replaces the use of ldq_le_p() with the use of ldq_p().
Additionally, it checks the target endianness to ensure the most
significant bits are always in the second element.
This patch adds big endian support for SVE GDB remote debugging. It
replaces the use of pointer dereferencing with the use of ldq_p().
Additionally, it checks the target endianness to ensure the most
significant bits are always in the second element.
--
2.34.1
On Tue, 22 Jul 2025 at 18:37, Vacha Bhavsar <vacha.bhavsar@oss.qualcomm.com> wrote:
>
> Upon examining the current implementation for getting/setting SIMD
> and SVE registers via remote GDB, there is a concern about mixed
> endian support. This patch series aims to address this concern and
> allow getting and setting the values of NEON and SVE registers via
> remote GDB regardless of the target endianness.

Thanks; I've applied these patches to target-arm.next (with
a bit of tweaking of the commit messages).

Something seems to have gone wrong with the creation of
this cover letter, by the way: it lists a lot of
patches that aren't in it.

> Glenn Miles (12):
>   ppc/xive2: Fix calculation of END queue sizes
>   ppc/xive2: Use fair irq target search algorithm
>   ppc/xive2: Fix irq preempted by lower priority group irq

[etc]

thanks
-- PMM
Thank you for the update! Apologies for the error in the cover letter.

On Fri, Aug 1, 2025 at 10:01 AM Peter Maydell <peter.maydell@linaro.org> wrote:
> On Tue, 22 Jul 2025 at 18:37, Vacha Bhavsar
> <vacha.bhavsar@oss.qualcomm.com> wrote:
> >
> > Upon examining the current implementation for getting/setting SIMD
> > and SVE registers via remote GDB, there is a concern about mixed
> > endian support. This patch series aims to address this concern and
> > allow getting and setting the values of NEON and SVE registers via
> > remote GDB regardless of the target endianness.
>
> Thanks; I've applied these patches to target-arm.next (with
> a bit of tweaking of the commit messages).
>
> Something seems to have gone wrong with the creation of
> this cover letter, by the way: it lists a lot of
> patches that aren't in it.
>
> > Glenn Miles (12):
> >   ppc/xive2: Fix calculation of END queue sizes
> >   ppc/xive2: Use fair irq target search algorithm
> >   ppc/xive2: Fix irq preempted by lower priority group irq
>
> [etc]
>
> thanks
> -- PMM
>