include/qemu/bitops.h | 13 + include/qemu/threaded-workqueue.h | 94 +++++++ migration/ram.c | 538 ++++++++++++++------------------------ tests/Makefile.include | 5 +- tests/threaded-workqueue-bench.c | 256 ++++++++++++++++++ util/Makefile.objs | 1 + util/threaded-workqueue.c | 466 +++++++++++++++++++++++++++++++++ 7 files changed, 1030 insertions(+), 343 deletions(-) create mode 100644 include/qemu/threaded-workqueue.h create mode 100644 tests/threaded-workqueue-bench.c create mode 100644 util/threaded-workqueue.c
From: Xiao Guangrong <xiaoguangrong@tencent.com>
Changelog in v2:
These changes are based on Paolo's suggestion:
1) rename the lockless multithreads model to threaded workqueue
2) hugely improve the internal design, that make all the request be
a large array, properly partition it, assign requests to threads
respectively and use bitmaps to sync up threads and the submitter,
after that ptr_ring and spinlock are dropped
3) introduce event wait for the submitter
These changes are based on Emilio's review:
4) make more detailed description for threaded workqueue
5) add a benchmark for threaded workqueue
The previous version can be found at
https://marc.info/?l=kvm&m=153968821910007&w=2
There's the simple performance measurement comparing these two versions,
the environment is the same as we listed in the previous version.
Use 8 threads to compress the data in the source QEMU
- with compress-wait-thread = off
total time busy-ratio
--------------------------------------------------
v1 125066 0.38
v2 120444 0.35
- with compress-wait-thread = on
total time busy-ratio
--------------------------------------------------
v1 164426 0
v2 142609 0
The v2 win slightly.
Xiao Guangrong (5):
bitops: introduce change_bit_atomic
util: introduce threaded workqueue
migration: use threaded workqueue for compression
migration: use threaded workqueue for decompression
tests: add threaded-workqueue-bench
include/qemu/bitops.h | 13 +
include/qemu/threaded-workqueue.h | 94 +++++++
migration/ram.c | 538 ++++++++++++++------------------------
tests/Makefile.include | 5 +-
tests/threaded-workqueue-bench.c | 256 ++++++++++++++++++
util/Makefile.objs | 1 +
util/threaded-workqueue.c | 466 +++++++++++++++++++++++++++++++++
7 files changed, 1030 insertions(+), 343 deletions(-)
create mode 100644 include/qemu/threaded-workqueue.h
create mode 100644 tests/threaded-workqueue-bench.c
create mode 100644 util/threaded-workqueue.c
--
2.14.5
Hi, Ping... On 11/6/18 8:20 PM, guangrong.xiao@gmail.com wrote: > From: Xiao Guangrong <xiaoguangrong@tencent.com> > > Changelog in v2: > These changes are based on Paolo's suggestion: > 1) rename the lockless multithreads model to threaded workqueue > 2) hugely improve the internal design, that make all the request be > a large array, properly partition it, assign requests to threads > respectively and use bitmaps to sync up threads and the submitter, > after that ptr_ring and spinlock are dropped > 3) introduce event wait for the submitter > > These changes are based on Emilio's review: > 4) make more detailed description for threaded workqueue > 5) add a benchmark for threaded workqueue > > The previous version can be found at > https://marc.info/?l=kvm&m=153968821910007&w=2 > > There's the simple performance measurement comparing these two versions, > the environment is the same as we listed in the previous version. > > Use 8 threads to compress the data in the source QEMU > - with compress-wait-thread = off > > > total time busy-ratio > -------------------------------------------------- > v1 125066 0.38 > v2 120444 0.35 > > - with compress-wait-thread = on > total time busy-ratio > -------------------------------------------------- > v1 164426 0 > v2 142609 0 > > The v2 win slightly. > > Xiao Guangrong (5): > bitops: introduce change_bit_atomic > util: introduce threaded workqueue > migration: use threaded workqueue for compression > migration: use threaded workqueue for decompression > tests: add threaded-workqueue-bench > > include/qemu/bitops.h | 13 + > include/qemu/threaded-workqueue.h | 94 +++++++ > migration/ram.c | 538 ++++++++++++++------------------------ > tests/Makefile.include | 5 +- > tests/threaded-workqueue-bench.c | 256 ++++++++++++++++++ > util/Makefile.objs | 1 + > util/threaded-workqueue.c | 466 +++++++++++++++++++++++++++++++++ > 7 files changed, 1030 insertions(+), 343 deletions(-) > create mode 100644 include/qemu/threaded-workqueue.h > create mode 100644 tests/threaded-workqueue-bench.c > create mode 100644 util/threaded-workqueue.c >
On 12/11/18 04:07, Xiao Guangrong wrote: > > Hi, > > Ping... Hi Guangrong, I think this isn't being reviewed because we're in freeze. Paolo > On 11/6/18 8:20 PM, guangrong.xiao@gmail.com wrote: >> From: Xiao Guangrong <xiaoguangrong@tencent.com> >> >> Changelog in v2: >> These changes are based on Paolo's suggestion: >> 1) rename the lockless multithreads model to threaded workqueue >> 2) hugely improve the internal design, that make all the request be >> a large array, properly partition it, assign requests to threads >> respectively and use bitmaps to sync up threads and the submitter, >> after that ptr_ring and spinlock are dropped >> 3) introduce event wait for the submitter >> >> These changes are based on Emilio's review: >> 4) make more detailed description for threaded workqueue >> 5) add a benchmark for threaded workqueue >> >> The previous version can be found at >> https://marc.info/?l=kvm&m=153968821910007&w=2 >> >> There's the simple performance measurement comparing these two versions, >> the environment is the same as we listed in the previous version. >> >> Use 8 threads to compress the data in the source QEMU >> - with compress-wait-thread = off >> >> >> total time busy-ratio >> -------------------------------------------------- >> v1 125066 0.38 >> v2 120444 0.35 >> >> - with compress-wait-thread = on >> total time busy-ratio >> -------------------------------------------------- >> v1 164426 0 >> v2 142609 0 >> >> The v2 win slightly. >> >> Xiao Guangrong (5): >> bitops: introduce change_bit_atomic >> util: introduce threaded workqueue >> migration: use threaded workqueue for compression >> migration: use threaded workqueue for decompression >> tests: add threaded-workqueue-bench >> >> include/qemu/bitops.h | 13 + >> include/qemu/threaded-workqueue.h | 94 +++++++ >> migration/ram.c | 538 >> ++++++++++++++------------------------ >> tests/Makefile.include | 5 +- >> tests/threaded-workqueue-bench.c | 256 ++++++++++++++++++ >> util/Makefile.objs | 1 + >> util/threaded-workqueue.c | 466 >> +++++++++++++++++++++++++++++++++ >> 7 files changed, 1030 insertions(+), 343 deletions(-) >> create mode 100644 include/qemu/threaded-workqueue.h >> create mode 100644 tests/threaded-workqueue-bench.c >> create mode 100644 util/threaded-workqueue.c >>
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20181106122025.3487-1-xiaoguangrong@tencent.com
Subject: [Qemu-devel] [PATCH v2 0/5] migration: improve multithreads
=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
64aabd11be tests: add threaded-workqueue-bench
e9efc70edd migration: use threaded workqueue for decompression
7753a846ad migration: use threaded workqueue for compression
f6933b5132 util: introduce threaded workqueue
810caaa566 bitops: introduce change_bit_atomic
=== OUTPUT BEGIN ===
Checking PATCH 1/5: bitops: introduce change_bit_atomic...
Checking PATCH 2/5: util: introduce threaded workqueue...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#41:
new file mode 100644
total: 0 errors, 1 warnings, 566 lines checked
Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 3/5: migration: use threaded workqueue for compression...
Checking PATCH 4/5: migration: use threaded workqueue for decompression...
Checking PATCH 5/5: tests: add threaded-workqueue-bench...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#36:
new file mode 100644
WARNING: line over 80 characters
#237: FILE: tests/threaded-workqueue-bench.c:197:
+ printf(" -r: the number of requests handled by each thread (default %d).\n",
WARNING: line over 80 characters
#239: FILE: tests/threaded-workqueue-bench.c:199:
+ printf(" -m: the size of the memory (G) used to test (default %dG).\n",
ERROR: line over 90 characters
#285: FILE: tests/threaded-workqueue-bench.c:245:
+ printf("Run the benchmark: threads %d requests-per-thread: %d memory %ldG repeat %d.\n",
total: 1 errors, 3 warnings, 273 lines checked
Your patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===
Test command exited with code: 1
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
© 2016 - 2025 Red Hat, Inc.