Add support for dma_map_sg and a new option '-m' to select the mapping mode.
i) Users can set option '-m' to select the mode:
DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE=1
(The mode is also shown in the test result.)
ii) Users can set option '-g' to set sg_nents
(the total number of entries in the scatterlist);
the maximum is 1024. Each sg buffer is PAGE_SIZE in size.
e.g.
[root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
dma mapping mode: DMA_MAP_BENCH_SG_MODE
dma mapping benchmark: threads:8 seconds:30 node:-1
dir:FROM_DEVICE granule/sg_nents: 8
average map latency(us):1.4 standard deviation:0.3
average unmap latency(us):1.3 standard deviation:0.3
[root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
dma mapping benchmark: threads:8 seconds:30 node:-1
dir:FROM_DEVICE granule/sg_nents: 8
average map latency(us):1.0 standard deviation:0.3
average unmap latency(us):1.3 standard deviation:0.5
Reviewed-by: Barry Song <baohua@kernel.org>
Signed-off-by: Qinxin Xia <xiaqinxin@huawei.com>
---
tools/dma/dma_map_benchmark.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/tools/dma/dma_map_benchmark.c b/tools/dma/dma_map_benchmark.c
index dd0ed528e6df..143ca8dab8af 100644
--- a/tools/dma/dma_map_benchmark.c
+++ b/tools/dma/dma_map_benchmark.c
@@ -20,12 +20,19 @@ static char *directions[] = {
 	"FROM_DEVICE",
 };
 
+static char *mode[] = {
+	"SINGLE_MODE",
+	"SG_MODE",
+};
+
 int main(int argc, char **argv)
 {
 	struct map_benchmark map;
 	int fd, opt;
 	/* default single thread, run 20 seconds on NUMA_NO_NODE */
 	int threads = 1, seconds = 20, node = -1;
+	/* default single map mode */
+	int map_mode = DMA_MAP_BENCH_SINGLE_MODE;
 	/* default dma mask 32bit, bidirectional DMA */
 	int bits = 32, xdelay = 0, dir = DMA_MAP_BIDIRECTIONAL;
 	/* default granule 1 PAGESIZE */
@@ -33,7 +40,7 @@ int main(int argc, char **argv)
 
 	int cmd = DMA_MAP_BENCHMARK;
 
-	while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:")) != -1) {
+	while ((opt = getopt(argc, argv, "t:s:n:b:d:x:g:m:")) != -1) {
 		switch (opt) {
 		case 't':
 			threads = atoi(optarg);
@@ -56,11 +63,20 @@ int main(int argc, char **argv)
 		case 'g':
 			granule = atoi(optarg);
 			break;
+		case 'm':
+			map_mode = atoi(optarg);
+			break;
 		default:
 			return -1;
 		}
 	}
 
+	if (map_mode < 0 || map_mode >= DMA_MAP_BENCH_MODE_MAX) {
+		fprintf(stderr, "invalid map mode, SINGLE_MODE:%d, SG_MODE: %d\n",
+			DMA_MAP_BENCH_SINGLE_MODE, DMA_MAP_BENCH_SG_MODE);
+		exit(1);
+	}
+
 	if (threads <= 0 || threads > DMA_MAP_MAX_THREADS) {
 		fprintf(stderr, "invalid number of threads, must be in 1-%d\n",
 			DMA_MAP_MAX_THREADS);
@@ -110,14 +126,15 @@ int main(int argc, char **argv)
 	map.dma_dir = dir;
 	map.dma_trans_ns = xdelay;
 	map.granule = granule;
+	map.map_mode = map_mode;
 
 	if (ioctl(fd, cmd, &map)) {
 		perror("ioctl");
 		exit(1);
 	}
 
-	printf("dma mapping benchmark: threads:%d seconds:%d node:%d dir:%s granule: %d\n",
-			threads, seconds, node, directions[dir], granule);
+	printf("dma mapping benchmark(%s): threads:%d seconds:%d node:%d dir:%s granule:%d\n",
+			mode[map_mode], threads, seconds, node, directions[dir], granule);
 	printf("average map latency(us):%.1f standard deviation:%.1f\n",
 			map.avg_map_100ns/10.0, map.map_stddev/10.0);
 	printf("average unmap latency(us):%.1f standard deviation:%.1f\n",
--
2.33.0
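Editorial sketch, not part of the posted patch: the patch parses '-m' with atoi(), which reports no errors, so a non-numeric argument such as "-m sg" would typically be read as 0 and silently select single mode. A stricter parse could look like the following. parse_map_mode() is a hypothetical helper, and the enum values are assumptions taken from the commit message (SINGLE_MODE=0, SG_MODE=1), with DMA_MAP_BENCH_MODE_MAX assumed to be one past the last valid mode; the real definitions live in the kernel header.

```c
#include <errno.h>
#include <stdlib.h>

/* Assumed values; the real definitions come from the kernel header. */
enum {
	DMA_MAP_BENCH_SINGLE_MODE = 0,
	DMA_MAP_BENCH_SG_MODE = 1,
	DMA_MAP_BENCH_MODE_MAX,	/* assumed: one past the last valid mode */
};

/*
 * Hypothetical helper: parse a '-m' argument strictly.
 * Returns 0 and stores the mode in *out on success, or -1 if arg is
 * empty, has trailing characters, overflows, or is out of range.
 */
static int parse_map_mode(const char *arg, int *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno || end == arg || *end != '\0')
		return -1;
	if (val < 0 || val >= DMA_MAP_BENCH_MODE_MAX)
		return -1;

	*out = (int)val;
	return 0;
}
```

In the option loop this would replace the atoi() call, for example by printing the existing "invalid map mode" message and exiting when parse_map_mode(optarg, &map_mode) returns -1.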
On Mon, Jan 12, 2026 at 5:34 PM Qinxin Xia <xiaqinxin@huawei.com> wrote:
>
> Support for dma_map_sg, add option '-m' to distinguish mode.
>
> i) Users can set option '-m' to select mode:
> DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE:=1
> (The mode is also show in the test result).
> ii) Users can set option '-g' to set sg_nents
> (total count of entries in scatterlist)
> the maximum number is 1024. Each of sg buf size is PAGE_SIZE.
> e.g
> [root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
> dma mapping mode: DMA_MAP_BENCH_SG_MODE
> dma mapping benchmark: threads:8 seconds:30 node:-1
> dir:FROM_DEVICE granule/sg_nents: 8
> average map latency(us):1.4 standard deviation:0.3
> average unmap latency(us):1.3 standard deviation:0.3
> [root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
> dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
> dma mapping benchmark: threads:8 seconds:30 node:-1
> dir:FROM_DEVICE granule/sg_nents: 8
> average map latency(us):1.0 standard deviation:0.3
> average unmap latency(us):1.3 standard deviation:0.5
>
What happens if m is set to 0 while g is set to 8?
Thanks
Barry
On 2026/1/26 10:51:11, Barry Song <21cnbao@gmail.com> wrote:
> On Mon, Jan 12, 2026 at 5:34 PM Qinxin Xia <xiaqinxin@huawei.com> wrote:
>>
>> Support for dma_map_sg, add option '-m' to distinguish mode.
>>
>> i) Users can set option '-m' to select mode:
>> DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE:=1
>> (The mode is also show in the test result).
>> ii) Users can set option '-g' to set sg_nents
>> (total count of entries in scatterlist)
>> the maximum number is 1024. Each of sg buf size is PAGE_SIZE.
>> e.g
>> [root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
>> dma mapping mode: DMA_MAP_BENCH_SG_MODE
>> dma mapping benchmark: threads:8 seconds:30 node:-1
>> dir:FROM_DEVICE granule/sg_nents: 8
>> average map latency(us):1.4 standard deviation:0.3
>> average unmap latency(us):1.3 standard deviation:0.3
>> [root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
>> dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
>> dma mapping benchmark: threads:8 seconds:30 node:-1
>> dir:FROM_DEVICE granule/sg_nents: 8
>> average map latency(us):1.0 standard deviation:0.3
>> average unmap latency(us):1.3 standard deviation:0.5
>>
>
> What happens if m is set to 0 while g is set to 8?
>
> Thanks
> Barry
Hi Barry!
With m set to 0 and g set to 8, 8 pages (8 * PAGE_SIZE) are mapped at a
time in single mode.
As the comment for the struct map_benchmark definition says:
__u32 granule; /* how many PAGE_SIZE will do map/unmap once a time */
[root@localhost xqx]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
dma mapping benchmark(SINGLE_MODE): threads:8 seconds:30 node:-1
dir:FROM_DEVICE granule:8
average map latency(us):0.2 standard deviation:0.1
average unmap latency(us):4.3 standard deviation:1.4
======================================================
The newly added SG mode reuses the -g option as sg_nents, as described
in the code comment:
/*
 * Set the number of scatterlist entries based on the granule.
 * In SG mode, 'granule' represents the number of scatterlist entries.
 * Each scatterlist entry corresponds to a single page.
 */
By the way, I considered testing scatterlist entries of different sizes,
but that is not easy to expose as a user parameter, so I fixed each
scatterlist entry at a single page.
Thanks,
Qinxin
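Editorial sketch of the two meanings of '-g' described above, assuming only what this reply states: in single mode one buffer of granule * PAGE_SIZE bytes is mapped per call, and in SG mode one scatterlist of granule entries, each one page long, is mapped per call. The kernel-side code is not shown in this thread, so struct map_shape, granule_to_shape() and the EXAMPLE_* constants are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two benchmark modes. */
enum { EXAMPLE_SINGLE_MODE = 0, EXAMPLE_SG_MODE = 1 };

/* What a single map call covers, as described in the reply above. */
struct map_shape {
	unsigned int nents;	/* number of segments handed to one map call */
	size_t seg_bytes;	/* size of each segment in bytes */
};

static struct map_shape granule_to_shape(int mode, unsigned int granule,
					 size_t page_size)
{
	struct map_shape s;

	if (mode == EXAMPLE_SG_MODE) {
		/* SG mode: 'granule' entries, one page each. */
		s.nents = granule;
		s.seg_bytes = page_size;
	} else {
		/* Single mode: one buffer of 'granule' pages. */
		s.nents = 1;
		s.seg_bytes = (size_t)granule * page_size;
	}
	return s;
}
```

With -g 8 and a 4096-byte page, both modes cover the same 32768 bytes per call; only the number of segments differs (1 versus 8).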
On Fri, Jan 30, 2026 at 4:38 PM Qinxin Xia <xiaqinxin@huawei.com> wrote:
>
>
>
> On 2026/1/26 10:51:11, Barry Song <21cnbao@gmail.com> wrote:
> > On Mon, Jan 12, 2026 at 5:34 PM Qinxin Xia <xiaqinxin@huawei.com> wrote:
> >>
> >> Support for dma_map_sg, add option '-m' to distinguish mode.
> >>
> >> i) Users can set option '-m' to select mode:
> >> DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE:=1
> >> (The mode is also show in the test result).
> >> ii) Users can set option '-g' to set sg_nents
> >> (total count of entries in scatterlist)
> >> the maximum number is 1024. Each of sg buf size is PAGE_SIZE.
> >> e.g
> >> [root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
> >> dma mapping mode: DMA_MAP_BENCH_SG_MODE
> >> dma mapping benchmark: threads:8 seconds:30 node:-1
> >> dir:FROM_DEVICE granule/sg_nents: 8
> >> average map latency(us):1.4 standard deviation:0.3
> >> average unmap latency(us):1.3 standard deviation:0.3
> >> [root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
> >> dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
> >> dma mapping benchmark: threads:8 seconds:30 node:-1
> >> dir:FROM_DEVICE granule/sg_nents: 8
> >> average map latency(us):1.0 standard deviation:0.3
> >> average unmap latency(us):1.3 standard deviation:0.5
> >>
> >
> > What happens if m is set to 0 while g is set to 8?
> >
> > Thanks
> > Barry
>
> Hi Barry!
> m set '0' and g set '8', This means that 8 page_sizes are mapped at a
> time in single mode.
> As the comment for the struct map_benchmark definition says:
>
> __u32 granule; /* how many PAGE_SIZE will do map/unmap once a time */
>
> [root@localhost xqx]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
> dma mapping benchmark(SINGLE_MODE): threads:8 seconds:30 node:-1
> dir:FROM_DEVICE granule:8
> average map latency(us):0.2 standard deviation:0.1
> average unmap latency(us):4.3 standard deviation:1.4
>
> ======================================================
> The newly added sg mode reuses the -g option as sgnents and is described
> in the comments:
> /*
> * Set the number of scatterlist entries based on the granule.
>
>
> * In SG mode, 'granule' represents the number of scatterlist
> entries.
> * Each scatterlist entry corresponds to a single page.
> */
>
> By the way, I've considered testing sgnents of different sizes, but it's
> not very easy to set for user parameters, so I set it with each
> scatterlist entry corresponds to a single page.
This is a bit odd. Ideally, a single variable shouldn’t have two
different meanings, but since this is just a tool, it may be
acceptable.
That said, the documentation should at least be updated in
patches 2/3 and 3/3. As it stands, it still says:
__u32 granule; /* how many PAGE_SIZE are mapped or unmapped
at a time */
>
> Thanks,
> Qinxin
>
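Editorial sketch of one possible wording for the updated field comment requested above. The text is a suggestion only, not taken from any posted patch; struct map_benchmark_excerpt is a hypothetical excerpt that shows just the granule field, with uint32_t standing in for __u32.

```c
#include <stdint.h>

/* Hypothetical excerpt; the real struct map_benchmark has more fields. */
struct map_benchmark_excerpt {
	/*
	 * Single mode: how many PAGE_SIZE are mapped or unmapped at a time.
	 * SG mode: number of scatterlist entries, each covering one page.
	 */
	uint32_t granule;
};
```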
On 2026/1/30 17:16:08, Barry Song <21cnbao@gmail.com> wrote:
> On Fri, Jan 30, 2026 at 4:38 PM Qinxin Xia <xiaqinxin@huawei.com> wrote:
>>
>>
>>
>> On 2026/1/26 10:51:11, Barry Song <21cnbao@gmail.com> wrote:
>>> On Mon, Jan 12, 2026 at 5:34 PM Qinxin Xia <xiaqinxin@huawei.com> wrote:
>>>>
>>>> Support for dma_map_sg, add option '-m' to distinguish mode.
>>>>
>>>> i) Users can set option '-m' to select mode:
>>>> DMA_MAP_BENCH_SINGLE_MODE=0, DMA_MAP_BENCH_SG_MODE:=1
>>>> (The mode is also show in the test result).
>>>> ii) Users can set option '-g' to set sg_nents
>>>> (total count of entries in scatterlist)
>>>> the maximum number is 1024. Each of sg buf size is PAGE_SIZE.
>>>> e.g
>>>> [root@localhost]# ./dma_map_benchmark -m 1 -g 8 -t 8 -s 30 -d 2
>>>> dma mapping mode: DMA_MAP_BENCH_SG_MODE
>>>> dma mapping benchmark: threads:8 seconds:30 node:-1
>>>> dir:FROM_DEVICE granule/sg_nents: 8
>>>> average map latency(us):1.4 standard deviation:0.3
>>>> average unmap latency(us):1.3 standard deviation:0.3
>>>> [root@localhost]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
>>>> dma mapping mode: DMA_MAP_BENCH_SINGLE_MODE
>>>> dma mapping benchmark: threads:8 seconds:30 node:-1
>>>> dir:FROM_DEVICE granule/sg_nents: 8
>>>> average map latency(us):1.0 standard deviation:0.3
>>>> average unmap latency(us):1.3 standard deviation:0.5
>>>>
>>>
>>> What happens if m is set to 0 while g is set to 8?
>>>
>>> Thanks
>>> Barry
>>
>> Hi Barry!
>> m set '0' and g set '8', This means that 8 page_sizes are mapped at a
>> time in single mode.
>> As the comment for the struct map_benchmark definition says:
>>
>> __u32 granule; /* how many PAGE_SIZE will do map/unmap once a time */
>>
>> [root@localhost xqx]# ./dma_map_benchmark -m 0 -g 8 -t 8 -s 30 -d 2
>> dma mapping benchmark(SINGLE_MODE): threads:8 seconds:30 node:-1
>> dir:FROM_DEVICE granule:8
>> average map latency(us):0.2 standard deviation:0.1
>> average unmap latency(us):4.3 standard deviation:1.4
>>
>> ======================================================
>> The newly added sg mode reuses the -g option as sgnents and is described
>> in the comments:
>> /*
>> * Set the number of scatterlist entries based on the granule.
>>
>>
>> * In SG mode, 'granule' represents the number of scatterlist
>> entries.
>> * Each scatterlist entry corresponds to a single page.
>> */
>>
>> By the way, I've considered testing sgnents of different sizes, but it's
>> not very easy to set for user parameters, so I set it with each
>> scatterlist entry corresponds to a single page.
>
> This is a bit odd. Ideally, we shouldn’t have a mixed definition
> for a single variant, but since this is just a tool, it may be
> acceptable.
>
> That said, the documentation should at least be updated in
> patches 2/3 and 3/3. As it stands, it still says:
>
> __u32 granule; /* how many PAGE_SIZE are mapped or unmapped
> at a time */
>
>
>>
>> Thanks,
>> Qinxin
>>
OK, I will update the documentation in the next version.
Do you have any other suggestions for this series?
--
Thanks,
Qinxin