Version 3 of this patch set makes both of its optimizations endianness-safe.
It also adds conditions so that __builtin_memcpy is only executed on chunks
of 16 bytes, for which atomicity can be guaranteed.
Changes from V2:
- patch 1:
  - add condition for the host not to be big endian.
- patch 2:
  - add condition for the host not to be big endian.
  - add condition for the host to support 16-byte atomic memory operations.
  - limit the large loads and stores to 16-byte chunks in order to guarantee
    atomicity on a larger range of processors.
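
For illustration only, the conditions listed above could be sketched roughly
as follows. This is not the code from the patches: copy_unit_stride,
CHUNK_BYTES and HOST_IS_LITTLE_ENDIAN are hypothetical names, the endianness
test relies on the GCC/Clang __BYTE_ORDER__ macro, and the check for host
support of 16-byte atomic operations is host-specific and left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Chunk size from the changelog: 16 bytes (name is hypothetical). */
#define CHUNK_BYTES 16

/* 1 when the compiler reports a little-endian host (GCC/Clang macros). */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define HOST_IS_LITTLE_ENDIAN 1
#else
#define HOST_IS_LITTLE_ENDIAN 0
#endif

/*
 * Hypothetical helper: copy len bytes of unit-stride data.
 * On a little-endian host the bulk is moved in fixed 16-byte chunks;
 * on any other host, and for the tail, bytes are copied one at a time.
 * Returns the number of 16-byte chunks that took the fast path.
 */
static size_t copy_unit_stride(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i = 0;
    size_t chunks = 0;

    assert(dst != NULL && src != NULL);

    if (HOST_IS_LITTLE_ENDIAN) {
        /* Fixed-size copies; a real implementation would also require
         * 16-byte atomic support before relying on atomicity. */
        for (; i + CHUNK_BYTES <= len; i += CHUNK_BYTES) {
            __builtin_memcpy(dst + i, src + i, CHUNK_BYTES);
            chunks++;
        }
    }

    /* Slow path: remaining bytes, or everything on a big-endian host. */
    for (; i < len; i++) {
        dst[i] = src[i];
    }

    return chunks;
}
```

The slow path remains the fallback whenever a condition does not hold, so
the result is the same on every host and only the fast path is gated.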
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Alistair Francis <alistair.francis@wdc.com>
Cc: Bin Meng <bmeng.cn@gmail.com>
Cc: Weiwei Li <liwei1518@gmail.com>
Cc: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Cc: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Cc: Helene Chelin <helene.chelin@embecosm.com>
Cc: Nathan Egge <negge@google.com>
Cc: Max Chou <max.chou@sifive.com>
Helene CHELIN (1):
  target/riscv: rvv: reduce the overhead for simple RISC-V vector
    unit-stride loads and stores

Paolo Savini (1):
  target/riscv: rvv: improve performance of RISC-V vector loads and
    stores on large amounts of data.

target/riscv/vector_helper.c | 61 +++++++++++++++++++++++++++++++++++-
1 file changed, 60 insertions(+), 1 deletion(-)
--
2.34.1