From nobody Mon Jun 15 06:35:44 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E81EE335063; Wed, 8 Apr 2026 14:25:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775658340; cv=none; b=XgpcfCBgE/iWbNldUT/ucUHGqUds8/N4YIbE/RacTMGjDRPw1c4NfdqHJvQC9A4xzfjps5w4qzMZqp/Pp26Qb8HMNLNsi/gI2l9DRXeizf/T0i5Z1y2LUY3ff0XQvUp4B5hlEChDHR3HIb6Rsv9zDPrEgEUGw2JKmAZbyhBPdvM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775658340; c=relaxed/simple; bh=qp5bMkCPIicW0vAK3EBrW1A8yDhhLZgvKi1mDE1KQ+Q=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XG7uDAkccdl5GopxoHV4kTo3x4IaD/AamTD2DURn87yMiR8+8UHyy9bHTmRIUPLvU7LD3Fbv7NSz8Fp1YWxPjOGGM+tT8Gkz80jn4A855vjfjIcjUSu8dqfNSPQqZNj8oeeMMkVBd5BlEWhOZYIdoWAIc/U+fS23GuOiHXC5fsI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=E8w13Z7t; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="E8w13Z7t" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E9ED3C2BC87; Wed, 8 Apr 2026 14:25:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1775658339; bh=qp5bMkCPIicW0vAK3EBrW1A8yDhhLZgvKi1mDE1KQ+Q=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=E8w13Z7tZwKG3xeGLv3x0jmBMpYPL3PfUJqgccOnYQntg7oRsXpq3Q1t78OijpZ8b Ca8VucyWJXLtN7bk4oIHrtfk1nRiMy98z76ztseABwbrI7VWqgx63gVuvok3Fq6UAS DpizIlGIwsbCGhgG9ocHvr1kt1MH8Nf/byIxoAqN0Ee1H8bAIPjZQxuDBEPb0haUJl 
3dsxEMFie5LIacV5bRYEGGQIkNQ1FiFaNqnPvEcDQz2X4ZaTtAHxJQKRRHDYXQhe6G 2RDmzAlJGv1Pe0gZrA0kz9Q7mO/7LI87CS6hFgOnvH623sNqHPMw7nzjYqCOHM7pHJ mUpzFxnthSzbA== From: Jeff Layton Date: Wed, 08 Apr 2026 10:25:21 -0400 Subject: [PATCH v2 1/3] mm: kick writeback flusher instead of inline flush for IOCB_DONTCACHE Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260408-dontcache-v2-1-948dec1e756b@kernel.org> References: <20260408-dontcache-v2-0-948dec1e756b@kernel.org> In-Reply-To: <20260408-dontcache-v2-0-948dec1e756b@kernel.org> To: Alexander Viro , Christian Brauner , Jan Kara , "Matthew Wilcox (Oracle)" , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Mike Snitzer , Jens Axboe , Chuck Lever Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=3840; i=jlayton@kernel.org; h=from:subject:message-id; bh=qp5bMkCPIicW0vAK3EBrW1A8yDhhLZgvKi1mDE1KQ+Q=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBp1mVfpEPxZ1DKFWwnUBWNtGPYod8zrgho1SpyC EpBrB6x/oSJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCadZlXwAKCRAADmhBGVaC FSd8D/97DxdSGpMqJRUTtYA7ZwjJxYqUSrSZuP3a8ITeFMIlQhUU4vnFN5vV4/z0DgAtoFwkzfk l4wCk29KBgfa+nrGhWlO0ndZakYnqFP61L30vVSn0QAJm95TsuAUmkRFup1B9Vbstw1QCKcOzez XeD9UU8gs1Ft0afeuZz05T/U5ymF24ZwN3oWIYcbPYsTLDzUnrCE7ahsvfdmJKH4Tt+G+kBr2EO 4YYYx02yAdnDP3z/65GpIdtVZVBxqcDWeXI6sn23BmopzDhHUCokatLqoM5GIVBC1oV3iQMQ2zE hmXMKtyjMzb1nJsWhfIXTs0yrUUdAn5EfNv0WJqTM9ZWor42LccpyfZHRhC0k02VAGMJAvTqhVl 4enQn4c2S9N6AwN3T8NFUAfIM9qViPK7Q+0dT5/0ar89P8Gzy42+znK8sJNrJPCbPKG9C0KUtPa eFFZmxvf6wf3iwF4UjdDRUh146QSctEy/n2/XFgQiZNfljJY1XW4/W5A8oblTi+kabmotpvFq2n 
 qEoa/KM/JEouKtbdVk4MVpGZUWcgcpXS4mj35GjFdVgXiI+bFHZIEl7S+myQl92xTltbvuIAail
 26L0w0byo+141FgqyUgqvaKDfd7JFVOq6HQEECkzU1lKZ4zj57E/tEUFd9iDLbF0QcBHCKiLQvN
 0EIWx1aXzxLNVOg==
X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215

The IOCB_DONTCACHE writeback path in generic_write_sync() calls
filemap_flush_range() on every write, submitting writeback inline in
the writer's context. Perf lock contention profiling shows the
performance problem is not lock contention but the writeback submission
work itself =E2=80=94 walking the page tree and submitting I/O blocks
the writer for milliseconds, inflating p99.9 latency from 23ms
(buffered) to 93ms (dontcache).

Replace the inline filemap_flush_range() call with a
wakeup_flusher_threads_bdi() call that kicks the BDI's flusher thread
to drain dirty pages in the background. This moves writeback submission
completely off the writer's hot path. The flusher thread handles
writeback asynchronously, naturally coalescing and rate-limiting I/O
without any explicit skip-if-busy or dirty pressure checks.

Add WB_REASON_DONTCACHE as a new writeback reason for tracing
visibility.

Signed-off-by: Jeff Layton
---
 fs/fs-writeback.c                | 14 ++++++++++++++
 include/linux/backing-dev-defs.h |  1 +
 include/linux/fs.h               |  6 ++----
 include/trace/events/writeback.h |  3 ++-
 4 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 3c75ee025bda..88dc31388a31 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2466,6 +2466,20 @@ void wakeup_flusher_threads_bdi(struct backing_dev_i=
nfo *bdi,
 	rcu_read_unlock();
 }
=20
+/**
+ * filemap_dontcache_kick_writeback - kick flusher for IOCB_DONTCACHE writ=
es
+ * @mapping: address_space that was just written to
+ *
+ * Wake the BDI flusher thread to start writeback of dirty pages in the
+ * background.
+ */
+void filemap_dontcache_kick_writeback(struct address_space *mapping)
+{
+	wakeup_flusher_threads_bdi(inode_to_bdi(mapping->host),
+				   WB_REASON_DONTCACHE);
+}
+EXPORT_SYMBOL(filemap_dontcache_kick_writeback);
+
 /*
  * Wakeup the flusher threads to start writeback of all currently dirty pa=
ges
  */
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-d=
efs.h
index c88fd4d37d1f..4a81c90a8928 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -55,6 +55,7 @@ enum wb_reason {
 	 */
 	WB_REASON_FORKER_THREAD,
 	WB_REASON_FOREIGN_FLUSH,
+	WB_REASON_DONTCACHE,
=20
 	WB_REASON_MAX,
 };
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8b3dd145b25e..2fd36608ac73 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2610,6 +2610,7 @@ extern int __must_check file_write_and_wait_range(str=
uct file *file,
 						   loff_t start, loff_t end);
 int filemap_flush_range(struct address_space *mapping, loff_t start,
 		loff_t end);
+void filemap_dontcache_kick_writeback(struct address_space *mapping);
=20
 static inline int file_write_and_wait(struct file *file)
 {
@@ -2643,10 +2644,7 @@ static inline ssize_t generic_write_sync(struct kioc=
b *iocb, ssize_t count)
 		if (ret)
 			return ret;
 	} else if (iocb->ki_flags & IOCB_DONTCACHE) {
-		struct address_space *mapping =3D iocb->ki_filp->f_mapping;
-
-		filemap_flush_range(mapping, iocb->ki_pos - count,
-				iocb->ki_pos - 1);
+		filemap_dontcache_kick_writeback(iocb->ki_filp->f_mapping);
 	}
=20
 	return count;
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeb=
ack.h
index 4d3d8c8f3a1b..9727af542699 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -44,7 +44,8 @@
 	EM( WB_REASON_PERIODIC,			"periodic")			\
 	EM( WB_REASON_FS_FREE_SPACE,		"fs_free_space")		\
 	EM( WB_REASON_FORKER_THREAD,		"forker_thread")		\
-	EMe(WB_REASON_FOREIGN_FLUSH,		"foreign_flush")
+	EM( WB_REASON_FOREIGN_FLUSH,		"foreign_flush")		\
+	EMe(WB_REASON_DONTCACHE,		"dontcache")
=20
WB_WORK_REASON =20 --=20 2.53.0 From nobody Mon Jun 15 06:35:44 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0DDD83D3333; Wed, 8 Apr 2026 14:25:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775658342; cv=none; b=hR7dmKPS5DzrhamEYHW4BYpxo36hfMexZaIyZZa/+mkHukbX28jVeSvnOZHKn5VVnMhUnYP90/iPiJM24KrGJghMjeKRAQZgDgiFJ1yy8Jr0h/tIaawk4TP70DUREB/1srxvpy+eu/Hfcq7o5XFzCBRX5NG3490y2cVngA9Ofls= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775658342; c=relaxed/simple; bh=taPEbFgqMwKXv25KgD+71KIz1AKpoNMNwUFNOGFERF8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=HhS1UNaLn6thAlyrL3NBBJirqwqblT1rHEDEXLhP5//NbpV8DQC8ypj+OvnrOY/2difRXNfYlMhrMsTx6jCb1r6IMwTq6XQyoUSmtd7m2JCftO4cK2fazph/IaN6QR6NlV+lXCsH9qEnCFIuAQV3+T5hcBMV3yEy0hambA3UUQg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TVCBe72b; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TVCBe72b" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D75BBC2BCB0; Wed, 8 Apr 2026 14:25:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1775658341; bh=taPEbFgqMwKXv25KgD+71KIz1AKpoNMNwUFNOGFERF8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=TVCBe72buTeUVsVPcbc9+qfWlMmhxvbV+YyrzL4D9u2Jpo+xFh/G+/22ne6RiBpS1 VIxapl+NWF+yNstIly9Py4PS2opcjmJNk6r5u/z/7Ir1X7Frga5atbKT4/n76qV6bd t4UuOhp2Yz9tai5qWoGU9DXghfweyjuS/7eqedxuhH7yQDEQZR17Qt5niS8B/TbnJU 
aQKG57nqYY4MFyAswV4ja0b/QTIEflx6L5JFvxEcpuAWZ9fgRrTVtAwSlw+HcVO6AA dL8Vm2bvSFvSi6KBgrGiXjdIzoalTtQNCFLuRT13YDseUirGYxBE6rv0uwvVLhaXC7 cx9NexbuVe+kg== From: Jeff Layton Date: Wed, 08 Apr 2026 10:25:22 -0400 Subject: [PATCH v2 2/3] testing: add nfsd-io-bench NFS server benchmark suite Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260408-dontcache-v2-2-948dec1e756b@kernel.org> References: <20260408-dontcache-v2-0-948dec1e756b@kernel.org> In-Reply-To: <20260408-dontcache-v2-0-948dec1e756b@kernel.org> To: Alexander Viro , Christian Brauner , Jan Kara , "Matthew Wilcox (Oracle)" , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Mike Snitzer , Jens Axboe , Chuck Lever Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=32977; i=jlayton@kernel.org; h=from:subject:message-id; bh=taPEbFgqMwKXv25KgD+71KIz1AKpoNMNwUFNOGFERF8=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBp1mVfN9bf/8o1DtS9XpzIGaPB1UpJIMkAnyS+d 878DF503GGJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCadZlXwAKCRAADmhBGVaC FR3qD/45o1cxW6ieDwXtdV5VOE/pAhY7wqhnUlmaiz1d1rod4Flgzwao3XD5JFwyC2PFhvMGQV9 amGNGJnnWU2QFm4bFFx1p69p9V/X9ITdo6bA/OtKO2OGhgiQNtD/++yK+9Ox8r4OZDJPWwrRx7O byGivTF+t9ZnQFglthwzISn4gd0X+5uCgadxruWvI7ayAI6cddI97gEeP5dJluel69aAHMLiO1Q p/KfvQHfuPlb1byK8yCvqM/7oMlHlsvGYP/fjUZppbmRFYOkhPwzlq8SLdEZurMM/vSjivyCy7I wS6MHTRKg6RQM4A9t/EaUL+BnM8+wBqo7RrDPE1gAR3BoeezhfMKJd4er38qkEfrWcTSVHOvmJu 0hToyICJ0fQ+ZAg4v3azCgmHWXOFuw/ilj5hfuI4dON3PofDQwuhEsUSHF/vfuqq/EpKxS/WrDl K86XdpItdvDFQSJkqIcS3ywkIHv4uWZqOVn3rwlPkXvx4OoQIUZ2ijPh1dE5L40a5XvqLycHweI 
 LwdFCeO5JF6akOQy8SNKd4wE6liIG1jNria2LwF8W7cUM+BdNMSvMpiu5Jk2o374pIt5VQuR1lr
 /o1Zc7gdTqmE6HaZ8BPjFRUnTFH2hFusb1H0zZKe93BKXO4q303qk7on205bYBImXdsvqdD3iMU
 YAFnVLbMLEcI0ZQ==
X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215

Add a benchmark suite for testing NFSD I/O mode performance using fio
with the libnfs backend against an NFS server on localhost. Tests
buffered, dontcache, and direct I/O modes via NFSD debugfs controls.

Includes:
 - fio job files for sequential/random read/write, multi-writer,
   noisy-neighbor, and latency-sensitive reader workloads
 - run-benchmarks.sh: orchestrates test matrix with mode switching
 - parse-results.sh: extracts metrics from fio JSON output
 - setup-server.sh: configures NFS export for testing

Signed-off-by: Jeff Layton
---
 .../testing/nfsd-io-bench/fio-jobs/lat-reader.fio  |  15 +
 .../testing/nfsd-io-bench/fio-jobs/multi-write.fio |  14 +
 .../nfsd-io-bench/fio-jobs/noisy-writer.fio        |  14 +
 tools/testing/nfsd-io-bench/fio-jobs/rand-read.fio |  15 +
 .../testing/nfsd-io-bench/fio-jobs/rand-write.fio  |  15 +
 tools/testing/nfsd-io-bench/fio-jobs/seq-read.fio  |  14 +
 tools/testing/nfsd-io-bench/fio-jobs/seq-write.fio |  14 +
 .../testing/nfsd-io-bench/scripts/parse-results.sh | 238 +++++++++
 .../nfsd-io-bench/scripts/run-benchmarks.sh        | 591 +++++++++++++++++=
++++
 .../testing/nfsd-io-bench/scripts/setup-server.sh  |  94 ++++
 10 files changed, 1024 insertions(+)

diff --git a/tools/testing/nfsd-io-bench/fio-jobs/lat-reader.fio b/tools/te=
sting/nfsd-io-bench/fio-jobs/lat-reader.fio
new file mode 100644
index 000000000000..61af37e8b860
--- /dev/null
+++ b/tools/testing/nfsd-io-bench/fio-jobs/lat-reader.fio
@@ -0,0 +1,15 @@
+[global]
+ioengine=3Dnfs
+nfs_url=3Dnfs://localhost/export
+direct=3D0
+bs=3D4k
+numjobs=3D16
+runtime=3D300
+time_based=3D1
+group_reporting=3D1
+rw=3Drandread
+log_avg_msec=3D1000
+write_bw_log=3Dlatreader
+write_lat_log=3Dlatreader
+
+[lat_reader]
diff --git a/tools/testing/nfsd-io-bench/fio-jobs/multi-write.fio b/tools/t=
esting/nfsd-io-bench/fio-jobs/multi-write.fio
new file mode 100644
index 000000000000..16b792aecabb
--- /dev/null
+++ b/tools/testing/nfsd-io-bench/fio-jobs/multi-write.fio
@@ -0,0 +1,14 @@
+[global]
+ioengine=3Dnfs
+nfs_url=3Dnfs://localhost/export
+direct=3D0
+bs=3D1M
+numjobs=3D16
+time_based=3D0
+group_reporting=3D1
+rw=3Dwrite
+log_avg_msec=3D1000
+write_bw_log=3Dmultiwrite
+write_lat_log=3Dmultiwrite
+
+[writer]
diff --git a/tools/testing/nfsd-io-bench/fio-jobs/noisy-writer.fio b/tools/=
testing/nfsd-io-bench/fio-jobs/noisy-writer.fio
new file mode 100644
index 000000000000..615154a7737e
--- /dev/null
+++ b/tools/testing/nfsd-io-bench/fio-jobs/noisy-writer.fio
@@ -0,0 +1,14 @@
+[global]
+ioengine=3Dnfs
+nfs_url=3Dnfs://localhost/export
+direct=3D0
+bs=3D1M
+numjobs=3D16
+time_based=3D0
+group_reporting=3D1
+rw=3Dwrite
+log_avg_msec=3D1000
+write_bw_log=3Dnoisywriter
+write_lat_log=3Dnoisywriter
+
+[bulk_writer]
diff --git a/tools/testing/nfsd-io-bench/fio-jobs/rand-read.fio b/tools/tes=
ting/nfsd-io-bench/fio-jobs/rand-read.fio
new file mode 100644
index 000000000000..501bae7416a8
--- /dev/null
+++ b/tools/testing/nfsd-io-bench/fio-jobs/rand-read.fio
@@ -0,0 +1,15 @@
+[global]
+ioengine=3Dnfs
+nfs_url=3Dnfs://localhost/export
+direct=3D0
+bs=3D4k
+numjobs=3D16
+runtime=3D300
+time_based=3D1
+group_reporting=3D1
+rw=3Drandread
+log_avg_msec=3D1000
+write_bw_log=3Drandread
+write_lat_log=3Drandread
+
+[randread]
diff --git a/tools/testing/nfsd-io-bench/fio-jobs/rand-write.fio b/tools/te=
sting/nfsd-io-bench/fio-jobs/rand-write.fio
new file mode 100644
index 000000000000..d891d04197ae
--- /dev/null
+++ b/tools/testing/nfsd-io-bench/fio-jobs/rand-write.fio
@@ -0,0 +1,15 @@
+[global]
+ioengine=3Dnfs
+nfs_url=3Dnfs://localhost/export
+direct=3D0
+bs=3D64k
+numjobs=3D16
+runtime=3D300
+time_based=3D1
+group_reporting=3D1
+rw=3Drandwrite
+log_avg_msec=3D1000
+write_bw_log=3Drandwrite
+write_lat_log=3Drandwrite + +[randwrite] diff --git a/tools/testing/nfsd-io-bench/fio-jobs/seq-read.fio b/tools/test= ing/nfsd-io-bench/fio-jobs/seq-read.fio new file mode 100644 index 000000000000..6e24ab355026 --- /dev/null +++ b/tools/testing/nfsd-io-bench/fio-jobs/seq-read.fio @@ -0,0 +1,14 @@ +[global] +ioengine=3Dnfs +nfs_url=3Dnfs://localhost/export +direct=3D0 +bs=3D1M +numjobs=3D16 +time_based=3D0 +group_reporting=3D1 +rw=3Dread +log_avg_msec=3D1000 +write_bw_log=3Dseqread +write_lat_log=3Dseqread + +[seqread] diff --git a/tools/testing/nfsd-io-bench/fio-jobs/seq-write.fio b/tools/tes= ting/nfsd-io-bench/fio-jobs/seq-write.fio new file mode 100644 index 000000000000..260858e345f5 --- /dev/null +++ b/tools/testing/nfsd-io-bench/fio-jobs/seq-write.fio @@ -0,0 +1,14 @@ +[global] +ioengine=3Dnfs +nfs_url=3Dnfs://localhost/export +direct=3D0 +bs=3D1M +numjobs=3D16 +time_based=3D0 +group_reporting=3D1 +rw=3Dwrite +log_avg_msec=3D1000 +write_bw_log=3Dseqwrite +write_lat_log=3Dseqwrite + +[seqwrite] diff --git a/tools/testing/nfsd-io-bench/scripts/parse-results.sh b/tools/t= esting/nfsd-io-bench/scripts/parse-results.sh new file mode 100755 index 000000000000..0427d411db04 --- /dev/null +++ b/tools/testing/nfsd-io-bench/scripts/parse-results.sh @@ -0,0 +1,238 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Parse fio JSON output and generate comparison tables. +# +# Usage: ./parse-results.sh + +set -euo pipefail + +if [ $# -lt 1 ]; then + echo "Usage: $0 " + exit 1 +fi + +RESULTS_DIR=3D"$1" + +if ! command -v jq &>/dev/null; then + echo "ERROR: jq is required" + exit 1 +fi + +# Extract metrics from a single fio JSON result +extract_metrics() { + local json_file=3D$1 + local rw_type=3D$2 # read or write + + if [ ! -f "$json_file" ]; then + echo "N/A N/A N/A N/A N/A N/A" + return + fi + + jq -r --arg rw "$rw_type" ' + .jobs[0][$rw] as $d | + [ + (($d.bw // 0) / 1024 | . 
* 10 | round / 10), # MB/s + ($d.iops // 0), # IOPS + ((($d.clat_ns.mean // 0) / 1000) | . * 10 | round / 10), # avg lat us + (($d.clat_ns.percentile["50.000000"] // 0) / 1000), # p50 us + (($d.clat_ns.percentile["99.000000"] // 0) / 1000), # p99 us + (($d.clat_ns.percentile["99.900000"] // 0) / 1000) # p99.9 us + ] | @tsv + ' "$json_file" 2>/dev/null || echo "N/A N/A N/A N/A N/A N/A" +} + +# Extract server CPU from vmstat log (average sys%) +extract_cpu() { + local vmstat_log=3D$1 + if [ ! -f "$vmstat_log" ]; then + echo "N/A" + return + fi + # vmstat columns: us sy id wa st =E2=80=94 skip header lines + awk 'NR>2 {sum+=3D$14; n++} END {if(n>0) printf "%.1f", sum/n; else print= "N/A"}' \ + "$vmstat_log" 2>/dev/null || echo "N/A" +} + +# Extract peak dirty pages from meminfo log +extract_peak_dirty() { + local meminfo_log=3D$1 + if [ ! -f "$meminfo_log" ]; then + echo "N/A" + return + fi + grep "^Dirty:" "$meminfo_log" | awk '{print $2}' | sort -n | tail -1 || e= cho "N/A" +} + +# Extract peak cached from meminfo log +extract_peak_cached() { + local meminfo_log=3D$1 + if [ ! 
-f "$meminfo_log" ]; then + echo "N/A" + return + fi + grep "^Cached:" "$meminfo_log" | awk '{print $2}' | sort -n | tail -1 || = echo "N/A" +} + +print_separator() { + printf '%*s\n' 120 '' | tr ' ' '-' +} + +######################################################################## +# Deliverable 1: Single-client results +######################################################################## +echo "" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo " Deliverable 1: Single-Client fio Benchmarks" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo "" + +for workload in seq-write rand-write seq-read rand-read; do + case $workload in + seq-write|rand-write) rw_type=3D"write" ;; + seq-read|rand-read) rw_type=3D"read" ;; + esac + + echo "--- $workload ---" + printf "%-16s %10s %10s %10s %10s %10s %10s %10s %12s %12s\n" \ + "Mode" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" "Sys CPU%= " "PeakDirty(kB)" "PeakCache(kB)" + print_separator + + for mode in buffered dontcache direct; do + dir=3D"${RESULTS_DIR}/${workload}/${mode}" + json_file=3D$(find "$dir" -name '*.json' -not -name 'client*' 2>/dev/nul= l | head -1 || true) + if [ -z "$json_file" ]; then + printf "%-16s %10s\n" "$mode" "(no data)" + continue + fi + + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "$rw_type")" + cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + cached=3D$(extract_peak_cached "${dir}/meminfo.log") + + printf "%-16s %10s %10s %10s %10s %10s %10s %10s %12s %12s\n" \ + "$mode" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" \ + "$cpu" "${dirty:-N/A}" "${cached:-N/A}" + done + 
echo "" +done + +######################################################################## +# Deliverable 2: Multi-client results +######################################################################## +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo " Deliverable 2: Noisy-Neighbor Benchmarks" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo "" + +# Scenario A: Multiple writers +echo "--- Scenario A: Multiple Writers ---" +for mode in buffered dontcache direct; do + dir=3D"${RESULTS_DIR}/multi-write/${mode}" + if [ ! -d "$dir" ]; then + continue + fi + + echo " Mode: $mode" + printf " %-10s %10s %10s %10s %10s %10s %10s\n" \ + "Client" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" + + total_bw=3D0 + count=3D0 + for json_file in "${dir}"/client*.json; do + [ -f "$json_file" ] || continue + client=3D$(basename "$json_file" .json) + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "write")" + printf " %-10s %10s %10s %10s %10s %10s %10s\n" \ + "$client" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + total_bw=3D$(echo "$total_bw + ${mbps:-0}" | bc 2>/dev/null || echo "$to= tal_bw") + count=3D$(( count + 1 )) + done + + cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + printf " Aggregate BW: %s MB/s | Sys CPU: %s%% | Peak Dirty: %s kB\n" \ + "$total_bw" "$cpu" "${dirty:-N/A}" + echo "" +done + +# Scenario C: Noisy neighbor +echo "--- Scenario C: Noisy Writer + Latency-Sensitive Readers ---" +for mode in buffered dontcache direct; do + dir=3D"${RESULTS_DIR}/noisy-neighbor/${mode}" + if [ ! 
-d "$dir" ]; then + continue + fi + + echo " Mode: $mode" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Job" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" + + # Writer + if [ -f "${dir}/noisy_writer.json" ]; then + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "${dir}/noisy_writer.json" "write")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Bulk writer" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + fi + + # Readers + for json_file in "${dir}"/reader*.json; do + [ -f "$json_file" ] || continue + reader=3D$(basename "$json_file" .json) + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "read")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "$reader" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + done + + cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + printf " Sys CPU: %s%% | Peak Dirty: %s kB\n" "$cpu" "${dirty:-N/A}" + echo "" +done + +# Scenario D: Mixed-mode noisy neighbor +echo "--- Scenario D: Mixed-Mode Noisy Writer + Readers ---" +for dir in "${RESULTS_DIR}"/noisy-neighbor-mixed/*/; do + [ -d "$dir" ] || continue + label=3D$(basename "$dir") + + echo " Mode: $label" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Job" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" + + # Writer + if [ -f "${dir}/noisy_writer.json" ]; then + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "${dir}/noisy_writer.json" "write")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Bulk writer" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + fi + + # Readers + for json_file in "${dir}"/reader*.json; do + [ -f "$json_file" ] || continue + reader=3D$(basename "$json_file" .json) + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "read")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "$reader" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + done + + 
cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + printf " Sys CPU: %s%% | Peak Dirty: %s kB\n" "$cpu" "${dirty:-N/A}" + echo "" +done + +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo " System Info" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +if [ -f "${RESULTS_DIR}/sysinfo.txt" ]; then + head -6 "${RESULTS_DIR}/sysinfo.txt" +fi +echo "" diff --git a/tools/testing/nfsd-io-bench/scripts/run-benchmarks.sh b/tools/= testing/nfsd-io-bench/scripts/run-benchmarks.sh new file mode 100755 index 000000000000..2b0cf6e79dff --- /dev/null +++ b/tools/testing/nfsd-io-bench/scripts/run-benchmarks.sh @@ -0,0 +1,591 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# NFS server I/O mode benchmark suite +# +# Runs fio with the NFS ioengine against an NFS server on localhost, +# testing buffered, dontcache, and direct I/O modes. 
+# +# Usage: ./run-benchmarks.sh [OPTIONS] +# +# Options: +# -e EXPORT_PATH Server export path (default: /export) +# -s SIZE fio file size, should be >=3D 2x RAM (default: auto-d= etect) +# -r RESULTS_DIR Where to store results (default: ./results) +# -n NFS_VER NFS version: 3 or 4 (default: 3) +# -j FIO_JOBS_DIR Path to fio job files (default: ../fio-jobs) +# -d Dry run: print commands without executing +# -h Show this help + +set -euo pipefail + +# Defaults +EXPORT_PATH=3D"/export" +SIZE=3D"" +RESULTS_DIR=3D"./results" +NFS_VER=3D3 +SCRIPT_DIR=3D"$(cd "$(dirname "$0")" && pwd)" +FIO_JOBS_DIR=3D"${SCRIPT_DIR}/../fio-jobs" +DRY_RUN=3D0 +MODES=3D"0 1 2" +PERF_LOCK=3D0 + +DEBUGFS_BASE=3D"/sys/kernel/debug/nfsd" +IO_CACHE_READ=3D"${DEBUGFS_BASE}/io_cache_read" +IO_CACHE_WRITE=3D"${DEBUGFS_BASE}/io_cache_write" +DISABLE_SPLICE=3D"${DEBUGFS_BASE}/disable-splice-read" + +usage() { + echo "Usage: $0 [OPTIONS]" + echo " -e EXPORT_PATH Server export path (default: /export)" + echo " -s SIZE fio file size (default: 2x RAM)" + echo " -r RESULTS_DIR Results directory (default: ./results)" + echo " -n NFS_VER NFS version: 3 or 4 (default: 3)" + echo " -j FIO_JOBS_DIR Path to fio job files" + echo " -D Dontcache only (skip buffered and direct tests)" + echo " -p Profile kernel lock contention with perf lock" + echo " -d Dry run" + echo " -h Help" + exit 1 +} + +while getopts "e:s:r:n:j:Dpdh" opt; do + case $opt in + e) EXPORT_PATH=3D"$OPTARG" ;; + s) SIZE=3D"$OPTARG" ;; + r) RESULTS_DIR=3D"$OPTARG" ;; + n) NFS_VER=3D"$OPTARG" ;; + j) FIO_JOBS_DIR=3D"$OPTARG" ;; + D) MODES=3D"1" ;; + p) PERF_LOCK=3D1 ;; + d) DRY_RUN=3D1 ;; + h) usage ;; + *) usage ;; + esac +done + +# Auto-detect size: 2x total RAM +if [ -z "$SIZE" ]; then + MEM_KB=3D$(awk '/MemTotal/ {print $2}' /proc/meminfo) + MEM_GB=3D$(( MEM_KB / 1024 / 1024 )) + SIZE=3D"$(( MEM_GB * 2 ))G" + echo "Auto-detected RAM: ${MEM_GB}G, using file size: ${SIZE}" +fi + + +log() { + echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" +} + 
+run_cmd() { + if [ "$DRY_RUN" -eq 1 ]; then + echo " [DRY RUN] $*" + else + "$@" + fi +} + +# Preflight checks +preflight() { + log "=3D=3D=3D Preflight checks =3D=3D=3D" + + if ! command -v fio &>/dev/null; then + echo "ERROR: fio not found in PATH" + exit 1 + fi + + # Check fio has nfs ioengine + if ! fio --enghelp=3Dnfs &>/dev/null; then + echo "ERROR: fio does not have the nfs ioengine (needs libnfs)" + exit 1 + fi + + # Check debugfs knobs exist + for knob in "$IO_CACHE_READ" "$IO_CACHE_WRITE" "$DISABLE_SPLICE"; do + if [ ! -f "$knob" ]; then + echo "ERROR: $knob not found. Is the kernel new enough?" + exit 1 + fi + done + + # Check NFS server is exporting + if ! showmount -e localhost 2>/dev/null | grep -q "$EXPORT_PATH"; then + echo "WARNING: $EXPORT_PATH not in showmount output, proceeding anyway" + fi + + # Print system info + echo "Kernel: $(uname -r)" + echo "RAM: $(awk '/MemTotal/ {printf "%.1f GB", $2/1024/1024}' /pr= oc/meminfo)" + echo "Export: $EXPORT_PATH" + echo "NFS ver: $NFS_VER" + echo "File size: $SIZE" + echo "Results: $RESULTS_DIR" + echo "" +} + +# Set server I/O mode via debugfs +set_io_mode() { + local cache_write=3D$1 + local cache_read=3D$2 + local splice_off=3D$3 + + log "Setting io_cache_write=3D$cache_write io_cache_read=3D$cache_read di= sable-splice-read=3D$splice_off" + run_cmd bash -c "echo $cache_write > $IO_CACHE_WRITE" + run_cmd bash -c "echo $cache_read > $IO_CACHE_READ" + run_cmd bash -c "echo $splice_off > $DISABLE_SPLICE" +} + +# Drop page cache on server +drop_caches() { + log "Dropping page cache" + run_cmd bash -c "sync && echo 3 > /proc/sys/vm/drop_caches" + sleep 1 +} + +# Start background server monitoring +start_monitors() { + local outdir=3D$1 + + log "Starting server monitors in $outdir" + run_cmd vmstat 1 > "${outdir}/vmstat.log" 2>&1 & + VMSTAT_PID=3D$! + + run_cmd iostat -x 1 > "${outdir}/iostat.log" 2>&1 & + IOSTAT_PID=3D$! 
+ + # Sample /proc/meminfo every second + (while true; do + echo "=3D=3D=3D $(date '+%s') =3D=3D=3D" + cat /proc/meminfo + sleep 1 + done) > "${outdir}/meminfo.log" 2>&1 & + MEMINFO_PID=3D$! +} + +# Stop background monitors +stop_monitors() { + log "Stopping monitors" + kill "$VMSTAT_PID" "$IOSTAT_PID" "$MEMINFO_PID" 2>/dev/null || true + wait "$VMSTAT_PID" "$IOSTAT_PID" "$MEMINFO_PID" 2>/dev/null || true +} + +# perf lock profiling =E2=80=94 uses BPF-based live contention tracing +PERF_LOCK_PID=3D"" + +start_perf_lock() { + local outdir=3D$1 + + if [ "$PERF_LOCK" -ne 1 ]; then + return + fi + + log "Starting perf lock contention tracing" + perf lock contention -a -b --max-stack 8 \ + > "${outdir}/perf-lock-contention.txt" 2>&1 & + PERF_LOCK_PID=3D$! +} + +stop_perf_lock() { + local outdir=3D$1 + + if [ -z "$PERF_LOCK_PID" ]; then + return + fi + + log "Stopping perf lock contention tracing" + kill -TERM "$PERF_LOCK_PID" 2>/dev/null || true + wait "$PERF_LOCK_PID" 2>/dev/null || true + PERF_LOCK_PID=3D"" +} + +# Run a single fio benchmark. +# nfs_url is set in the job files; we pass --filename and --size on +# the command line to vary the target file and data volume per run. +# Pass "keep" as 5th arg to preserve the test file after the run. 
+run_fio() { + local job_file=3D$1 + local outdir=3D$2 + local filename=3D$3 + local fio_size=3D${4:-$SIZE} + local keep=3D${5:-} + + local job_name + job_name=3D$(basename "$job_file" .fio) + + log "Running fio job: $job_name -> $outdir (file=3D$filename size=3D$fio_= size)" + mkdir -p "$outdir" + + drop_caches + start_monitors "$outdir" + # Skip perf lock profiling for precreate/setup runs + [ "$keep" !=3D "keep" ] && start_perf_lock "$outdir" + + run_cmd fio "$job_file" \ + --output-format=3Djson \ + --output=3D"${outdir}/${job_name}.json" \ + --filename=3D"$filename" \ + --size=3D"$fio_size" + + [ "$keep" !=3D "keep" ] && stop_perf_lock "$outdir" + stop_monitors + + log "Finished: $job_name" + + # Clean up test file to free disk space unless told to keep it + if [ "$keep" !=3D "keep" ]; then + cleanup_test_files "$filename" + fi +} + +# Remove test files from the export to free disk space +cleanup_test_files() { + local filename + for filename in "$@"; do + local filepath=3D"${EXPORT_PATH}/${filename}" + log "Cleaning up: $filepath" + run_cmd rm -f "$filepath" + done +} + +# Ensure parent directories exist under the export for a given filename +ensure_export_dirs() { + local filename + for filename in "$@"; do + local dirpath=3D"${EXPORT_PATH}/$(dirname "$filename")" + if [ "$dirpath" !=3D "${EXPORT_PATH}/." ] && [ ! 
-d "$dirpath" ]; then + log "Creating directory: $dirpath" + run_cmd mkdir -p "$dirpath" + fi + done +} + +# Mode name from numeric value +mode_name() { + case $1 in + 0) echo "buffered" ;; + 1) echo "dontcache" ;; + 2) echo "direct" ;; + esac +} + +######################################################################## +# Deliverable 1: Single-client fio benchmarks +######################################################################## +run_deliverable1() { + log "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" + log "Deliverable 1: Single-client fio benchmarks" + log "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" + + # Write test matrix: + # mode 0 (buffered): splice on (default) + # mode 1 (dontcache): splice off (required) + # mode 2 (direct): splice off (required) + + # Sequential write + for wmode in $MODES; do + local mname + mname=3D$(mode_name $wmode) + local splice_off=3D0 + [ "$wmode" -ne 0 ] && splice_off=3D1 + + drop_caches + set_io_mode "$wmode" 0 "$splice_off" + run_fio "${FIO_JOBS_DIR}/seq-write.fio" \ + "${RESULTS_DIR}/seq-write/${mname}" \ + "seq-write_testfile" + done + + # Random write + for wmode in $MODES; do + local mname + mname=3D$(mode_name $wmode) + local splice_off=3D0 + [ "$wmode" -ne 0 ] && splice_off=3D1 + + drop_caches + set_io_mode "$wmode" 0 "$splice_off" + run_fio "${FIO_JOBS_DIR}/rand-write.fio" \ + "${RESULTS_DIR}/rand-write/${mname}" \ + "rand-write_testfile" + done + + # Sequential read =E2=80=94 vary read mode, write stays buffered + # Pre-create the file for reading + log "Pre-creating sequential read test file" + set_io_mode 0 0 0 + run_fio "${FIO_JOBS_DIR}/seq-write.fio" \ + "${RESULTS_DIR}/seq-read/precreate" \ + "seq-read_testfile" "$SIZE" "keep" + + # shellcheck disable=3DSC2086 + local last_mode + last_mode=3D$(echo $MODES | awk '{print $NF}') + + for 
rmode in $MODES; do + local mname + mname=3D$(mode_name $rmode) + local splice_off=3D0 + [ "$rmode" -ne 0 ] && splice_off=3D1 + # Keep file for subsequent modes; clean up after last + local keep=3D"keep" + [ "$rmode" =3D "$last_mode" ] && keep=3D"" + + drop_caches + set_io_mode 0 "$rmode" "$splice_off" + run_fio "${FIO_JOBS_DIR}/seq-read.fio" \ + "${RESULTS_DIR}/seq-read/${mname}" \ + "seq-read_testfile" "$SIZE" "$keep" + done + + # Random read =E2=80=94 vary read mode, write stays buffered + # Pre-create the file for reading + log "Pre-creating random read test file" + set_io_mode 0 0 0 + run_fio "${FIO_JOBS_DIR}/seq-write.fio" \ + "${RESULTS_DIR}/rand-read/precreate" \ + "rand-read_testfile" "$SIZE" "keep" + + for rmode in $MODES; do + local mname + mname=3D$(mode_name $rmode) + local splice_off=3D0 + [ "$rmode" -ne 0 ] && splice_off=3D1 + # Keep file for subsequent modes; clean up after last + local keep=3D"keep" + [ "$rmode" =3D "$last_mode" ] && keep=3D"" + + drop_caches + set_io_mode 0 "$rmode" "$splice_off" + run_fio "${FIO_JOBS_DIR}/rand-read.fio" \ + "${RESULTS_DIR}/rand-read/${mname}" \ + "rand-read_testfile" "$SIZE" "$keep" + done +} + +######################################################################## +# Deliverable 2: Multi-client (simulated with multiple fio jobs) +######################################################################## +run_deliverable2() { + log "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" + log "Deliverable 2: Noisy-neighbor benchmarks" + log "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" + + local num_clients=3D4 + local client_size + local mem_kb + mem_kb=3D$(awk '/MemTotal/ {print $2}' /proc/meminfo) + # Each client gets RAM/num_clients, so the combined size is about equal to RAM + client_size=3D"$(( mem_kb / 1024 / num_clients ))M" + + # Scenario A: Multiple writers + for mode in 
$MODES; do + local mname + mname=3D$(mode_name $mode) + local splice_off=3D0 + [ "$mode" -ne 0 ] && splice_off=3D1 + local outdir=3D"${RESULTS_DIR}/multi-write/${mname}" + mkdir -p "$outdir" + + set_io_mode "$mode" "$mode" "$splice_off" + drop_caches + + # Ensure client directories exist on export + for i in $(seq 1 $num_clients); do + ensure_export_dirs "client${i}/testfile" + done + + start_monitors "$outdir" + start_perf_lock "$outdir" + + # Launch N parallel fio writers + local pids=3D() + for i in $(seq 1 $num_clients); do + run_cmd fio "${FIO_JOBS_DIR}/multi-write.fio" \ + --output-format=3Djson \ + --output=3D"${outdir}/client${i}.json" \ + --filename=3D"client${i}/testfile" \ + --size=3D"$client_size" & + pids+=3D($!) + done + + # Wait for all + local rc=3D0 + for pid in "${pids[@]}"; do + wait "$pid" || rc=3D$? + done + + stop_perf_lock "$outdir" + stop_monitors + [ $rc -ne 0 ] && log "WARNING: some fio jobs exited non-zero" + + # Clean up test files + for i in $(seq 1 $num_clients); do + cleanup_test_files "client${i}/testfile" + done + done + + # Scenario C: Noisy writer + latency-sensitive readers + for mode in $MODES; do + local mname + mname=3D$(mode_name $mode) + local splice_off=3D0 + [ "$mode" -ne 0 ] && splice_off=3D1 + local outdir=3D"${RESULTS_DIR}/noisy-neighbor/${mname}" + mkdir -p "$outdir" + + set_io_mode "$mode" "$mode" "$splice_off" + drop_caches + + # Pre-create read files for latency readers + for i in $(seq 1 $(( num_clients - 1 ))); do + ensure_export_dirs "reader${i}/readfile" + log "Pre-creating read file for reader $i" + run_fio "${FIO_JOBS_DIR}/multi-write.fio" \ + "${outdir}/precreate_reader${i}" \ + "reader${i}/readfile" \ + "512M" "keep" + done + drop_caches + ensure_export_dirs "bulk/testfile" + start_monitors "$outdir" + start_perf_lock "$outdir" + + # Noisy writer + run_cmd fio "${FIO_JOBS_DIR}/noisy-writer.fio" \ + --output-format=3Djson \ + --output=3D"${outdir}/noisy_writer.json" \ + --filename=3D"bulk/testfile" \ + 
--size=3D"$SIZE" & + local writer_pid=3D$! + + # Latency-sensitive readers + local reader_pids=3D() + for i in $(seq 1 $(( num_clients - 1 ))); do + run_cmd fio "${FIO_JOBS_DIR}/lat-reader.fio" \ + --output-format=3Djson \ + --output=3D"${outdir}/reader${i}.json" \ + --filename=3D"reader${i}/readfile" \ + --size=3D"512M" & + reader_pids+=3D($!) + done + + local rc=3D0 + wait "$writer_pid" || rc=3D$? + for pid in "${reader_pids[@]}"; do + wait "$pid" || rc=3D$? + done + + stop_perf_lock "$outdir" + stop_monitors + [ $rc -ne 0 ] && log "WARNING: some fio jobs exited non-zero" + + # Clean up test files + cleanup_test_files "bulk/testfile" + for i in $(seq 1 $(( num_clients - 1 ))); do + cleanup_test_files "reader${i}/readfile" + done + done + # Scenario D: Mixed-mode noisy neighbor + # Test write/read mode combinations where the writer uses a + # cache-friendly mode and readers use buffered reads to benefit + # from warm cache. + local mixed_modes=3D( + # write_mode read_mode label + "1 0 dontcache-w_buffered-r" + ) + + for combo in "${mixed_modes[@]}"; do + local wmode rmode label + read -r wmode rmode label <<< "$combo" + local splice_off=3D0 + [ "$wmode" -ne 0 ] && splice_off=3D1 + local outdir=3D"${RESULTS_DIR}/noisy-neighbor-mixed/${label}" + mkdir -p "$outdir" + + set_io_mode "$wmode" "$rmode" "$splice_off" + drop_caches + + # Pre-create read files for latency readers + for i in $(seq 1 $(( num_clients - 1 ))); do + ensure_export_dirs "reader${i}/readfile" + log "Pre-creating read file for reader $i" + run_fio "${FIO_JOBS_DIR}/multi-write.fio" \ + "${outdir}/precreate_reader${i}" \ + "reader${i}/readfile" \ + "512M" "keep" + done + drop_caches + ensure_export_dirs "bulk/testfile" + start_monitors "$outdir" + start_perf_lock "$outdir" + + # Noisy writer + run_cmd fio "${FIO_JOBS_DIR}/noisy-writer.fio" \ + --output-format=3Djson \ + --output=3D"${outdir}/noisy_writer.json" \ + --filename=3D"bulk/testfile" \ + --size=3D"$SIZE" & + local writer_pid=3D$! 
+ + # Latency-sensitive readers + local reader_pids=3D() + for i in $(seq 1 $(( num_clients - 1 ))); do + run_cmd fio "${FIO_JOBS_DIR}/lat-reader.fio" \ + --output-format=3Djson \ + --output=3D"${outdir}/reader${i}.json" \ + --filename=3D"reader${i}/readfile" \ + --size=3D"512M" & + reader_pids+=3D($!) + done + + local rc=3D0 + wait "$writer_pid" || rc=3D$? + for pid in "${reader_pids[@]}"; do + wait "$pid" || rc=3D$? + done + + stop_perf_lock "$outdir" + stop_monitors + [ $rc -ne 0 ] && log "WARNING: some fio jobs exited non-zero" + + # Clean up test files + cleanup_test_files "bulk/testfile" + for i in $(seq 1 $(( num_clients - 1 ))); do + cleanup_test_files "reader${i}/readfile" + done + done +} + +######################################################################## +# Main +######################################################################## +preflight + +TIMESTAMP=3D$(date '+%Y%m%d-%H%M%S') +RESULTS_DIR=3D"${RESULTS_DIR}/${TIMESTAMP}" +mkdir -p "$RESULTS_DIR" + +# Save system info +{ + echo "Timestamp: $TIMESTAMP" + echo "Kernel: $(uname -r)" + echo "Hostname: $(hostname)" + echo "NFS version: $NFS_VER" + echo "File size: $SIZE" + echo "Export: $EXPORT_PATH" + cat /proc/meminfo +} > "${RESULTS_DIR}/sysinfo.txt" + +log "Results will be saved to: $RESULTS_DIR" + +run_deliverable1 +run_deliverable2 + +# Reset to defaults +set_io_mode 0 0 0 + +log "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +log "All benchmarks complete." 
+log "Results in: $RESULTS_DIR" +log "Run: scripts/parse-results.sh $RESULTS_DIR" +log "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" diff --git a/tools/testing/nfsd-io-bench/scripts/setup-server.sh b/tools/te= sting/nfsd-io-bench/scripts/setup-server.sh new file mode 100755 index 000000000000..0efdd74a705e --- /dev/null +++ b/tools/testing/nfsd-io-bench/scripts/setup-server.sh @@ -0,0 +1,94 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# One-time setup script for the NFS test server. +# Run this once before running benchmarks. +# +# Usage: sudo ./setup-server.sh [EXPORT_PATH] + +set -euo pipefail + +EXPORT_PATH=3D"${1:-/export}" +FSTYPE=3D"ext4" + +log() { + echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*" +} + +if [ "$(id -u)" -ne 0 ]; then + echo "ERROR: must run as root" + exit 1 +fi + +# Check for required tools +for cmd in fio exportfs showmount jq; do + if ! command -v "$cmd" &>/dev/null; then + echo "WARNING: $cmd not found, attempting install" + dnf install -y "$cmd" 2>/dev/null || \ + apt-get install -y "$cmd" 2>/dev/null || \ + echo "ERROR: cannot install $cmd, please install manually" + fi +done + +# Check fio has nfs ioengine +if ! fio --enghelp=3Dnfs &>/dev/null; then + echo "ERROR: fio nfs ioengine not available." + echo "You may need to install fio with libnfs support." + echo "Try: dnf install fio libnfs-devel (or build fio from source with -= -enable-nfs)" + exit 1 +fi + +# Create export directory if needed +if [ ! -d "$EXPORT_PATH" ]; then + log "Creating export directory: $EXPORT_PATH" + mkdir -p "$EXPORT_PATH" +fi + +# Create subdirectories for multi-client tests +for i in 1 2 3 4; do + mkdir -p "${EXPORT_PATH}/client${i}" + mkdir -p "${EXPORT_PATH}/reader${i}" +done +mkdir -p "${EXPORT_PATH}/bulk" + +# Check if already exported +if ! exportfs -s 2>/dev/null | grep -q "$EXPORT_PATH"; then + log "Adding NFS export for $EXPORT_PATH" + if ! 
grep -q "$EXPORT_PATH" /etc/exports 2>/dev/null; then + echo "${EXPORT_PATH} 127.0.0.1/32(rw,sync,no_root_squash,no_subtree_chec= k)" >> /etc/exports + fi + exportfs -ra +fi + +# Ensure NFS server is running +if ! systemctl is-active --quiet nfs-server 2>/dev/null; then + log "Starting NFS server" + systemctl start nfs-server +fi + +# Verify export +log "Current exports:" +showmount -e localhost + +# Check debugfs knobs +log "Checking debugfs knobs:" +DEBUGFS_BASE=3D"/sys/kernel/debug/nfsd" +for knob in io_cache_read io_cache_write disable-splice-read; do + if [ -f "${DEBUGFS_BASE}/${knob}" ]; then + echo " ${knob} =3D $(cat "${DEBUGFS_BASE}/${knob}")" + else + echo " ${knob}: NOT FOUND (kernel may be too old)" + fi +done + +# Print system summary +echo "" +log "=3D=3D=3D System Summary =3D=3D=3D" +echo "Kernel: $(uname -r)" +echo "RAM: $(awk '/MemTotal/ {printf "%.1f GB", $2/1024/1024}' /pr= oc/meminfo)" +echo "Export: $EXPORT_PATH" +echo "Filesystem: $(df -T "$EXPORT_PATH" | awk 'NR=3D=3D2 {print $2}')" +echo "Disk: $(df -h "$EXPORT_PATH" | awk 'NR=3D=3D2 {print $2, "tot= al,", $4, "free"}')" +echo "" +log "Setup complete. 
Run benchmarks with:" +echo " sudo ./scripts/run-benchmarks.sh -e $EXPORT_PATH" --=20 2.53.0 From nobody Mon Jun 15 06:35:44 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 07D412ED154; Wed, 8 Apr 2026 14:25:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775658350; cv=none; b=YqsPgQYP80Cu3ZSfPW/xtsZjMJaQaSgKWUNNcjEUfhXGrX6wSWAUPYpWS0YclSWhkyGg96RGN4FuNVWqncwEutAjRK+E1vALgui1x5bSMDmydg6dRvezkn8tkShHzpx0dJJAHyxCWaKlTiYjyCCh5p6h5m6oovPNLTqG28jJkxA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775658350; c=relaxed/simple; bh=9fMBcndRelCdNBsbx+5D1lanDNkdDt1zAqn86GBZOq0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Dn18jCLqoPk7IfnnfvkvrdMv9y3tHKfvrN3l3txQdYAvLYwBQWPd1mB+RKZNFYhE93bGnGKSt+rG1YycSh3xPiGO2hqMzK3R0t6vGsRjKKk0RXkSNtWz2M9yMqtE36Hp7SiQyjTlWYVMRSHUXWabnBalP719/PJG2xv1lZVPGKE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=osLLI3Zx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="osLLI3Zx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DD0F7C2BCB3; Wed, 8 Apr 2026 14:25:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1775658343; bh=9fMBcndRelCdNBsbx+5D1lanDNkdDt1zAqn86GBZOq0=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=osLLI3ZxOAAEt9/AJxrbGALPPn5SVxUwNpOup0G7/sCjltROaEc+tHlg2N8wS8hTP BMVvNf28eshboEH10pJvoZyHN3HYNAFR5DCwV/I6LVdQxm4S9VS9hlbQyi00ye8+Yt 
9CTlagztrBbEauO00ysLSaAbDeTUO7kvsS6KQgozg0x3JtgKHRQ4hPfoBP6Eh/lYz3 LIXAz2oFSLbId8uNz8WMozCDfw0+yx42FkIUClP6PMsLEGFnaqXLNuC9k6HEZ2n8s3 EYk2ExApV0Md85Mss5jzzkc9SApoysMkeZ7aX2zJIW3VPvOxGUV4bLFsm6Pa7QdeHv qOGGhocmktCGQ== From: Jeff Layton Date: Wed, 08 Apr 2026 10:25:23 -0400 Subject: [PATCH v2 3/3] testing: add dontcache-bench local filesystem benchmark suite Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260408-dontcache-v2-3-948dec1e756b@kernel.org> References: <20260408-dontcache-v2-0-948dec1e756b@kernel.org> In-Reply-To: <20260408-dontcache-v2-0-948dec1e756b@kernel.org> To: Alexander Viro , Christian Brauner , Jan Kara , "Matthew Wilcox (Oracle)" , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Mike Snitzer , Jens Axboe , Chuck Lever Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=28518; i=jlayton@kernel.org; h=from:subject:message-id; bh=9fMBcndRelCdNBsbx+5D1lanDNkdDt1zAqn86GBZOq0=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBp1mVfP/c/dAE1yzJV9kRwK/p0Y8r70LYmvfNZ+ j2HJ2K70WiJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCadZlXwAKCRAADmhBGVaC FWRDEACWvOr4eTPWHn6OhGZcxaNqME0yk/2CrKm4h0VvUcHccL16GHaWzZjB7pllHxVsTzwiJvV IFlfiBa1yjDWSR4Y6ykf2OLY0M67TrkvXxU/MLSD/gEswl3MUgpVV0xD1bIQqxhRtRi5RnC40lw b/EJaAZwmwEe+QVwpdjhg85TmcpI2LtuXOeUhimJwi14KbHNWPFiDB1AHPouMg735Wm2ju7AkO1 kMNXkO7N50sKp964YXxhyXUAva9jDVjQ9HlLS2n/iJY0PrET/g0n1iFXOjxtbpTX1BLPyX05wOd vnOmR2J79Pb1ZPAY0PZNvdFuIr+x8xH2ZZSoOeF1vZM7ydk7qmbw49bXHe4ZA3daoz3ufAH249C SwXLlcJDfXqo/rg+POiLZh/EShvb3MnuxAWUS7csrxkm2Zwfl6nLYL9fhd/ZVfqxauVzz/T1dN7 n4gRl0xvVoh236XlM0Rk2YAAK97PIvLbKJ6sgya9cHcgsKjWB90GGaSLQDT9fNS7jRzvb7/D6nY 
rJDKqYT+qDo+XJh/wZUWWGmOtfH1P2ffK0/gZGzeNmBm1CxzOQV92pL+bDI61ZfkYEqAjouG68G af+UDJjiPnL19+wyk/UVg8JtRHY32ZiW9PBW3AnIetwlSAT0YW2WwLi8AOoH6fI8qV2cGwDinYh Z2V6/toGlIk8sgg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 Add a benchmark suite for testing IOCB_DONTCACHE on local filesystems via fio's io_uring engine with the RWF_DONTCACHE flag. The suite mirrors the nfsd-io-bench test matrix but uses io_uring with the "uncached" fio option instead of NFSD debugfs mode switching: - uncached=3D0: standard buffered I/O - uncached=3D1: RWF_DONTCACHE - Mode 2 uses O_DIRECT via fio's --direct=3D1 Includes fio job files, run-benchmarks.sh, and parse-results.sh. Signed-off-by: Jeff Layton --- .../dontcache-bench/fio-jobs/lat-reader.fio | 12 + .../dontcache-bench/fio-jobs/multi-write.fio | 9 + .../dontcache-bench/fio-jobs/noisy-writer.fio | 12 + .../testing/dontcache-bench/fio-jobs/rand-read.fio | 13 + .../dontcache-bench/fio-jobs/rand-write.fio | 13 + .../testing/dontcache-bench/fio-jobs/seq-read.fio | 13 + .../testing/dontcache-bench/fio-jobs/seq-write.fio | 13 + .../dontcache-bench/scripts/parse-results.sh | 238 +++++++++ .../dontcache-bench/scripts/run-benchmarks.sh | 562 +++++++++++++++++= ++++ 9 files changed, 885 insertions(+) diff --git a/tools/testing/dontcache-bench/fio-jobs/lat-reader.fio b/tools/= testing/dontcache-bench/fio-jobs/lat-reader.fio new file mode 100644 index 000000000000..e221e7aedec9 --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/lat-reader.fio @@ -0,0 +1,12 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D4k +numjobs=3D1 +time_based=3D0 +rw=3Dread +log_avg_msec=3D1000 +write_bw_log=3Dlatreader +write_lat_log=3Dlatreader + +[latreader] diff --git a/tools/testing/dontcache-bench/fio-jobs/multi-write.fio b/tools= /testing/dontcache-bench/fio-jobs/multi-write.fio new file mode 100644 index 000000000000..8fc0770f5860 --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/multi-write.fio @@ 
-0,0 +1,9 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D1M +numjobs=3D1 +time_based=3D0 +rw=3Dwrite + +[multiwrite] diff --git a/tools/testing/dontcache-bench/fio-jobs/noisy-writer.fio b/tool= s/testing/dontcache-bench/fio-jobs/noisy-writer.fio new file mode 100644 index 000000000000..4524eebd4642 --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/noisy-writer.fio @@ -0,0 +1,12 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D1M +numjobs=3D1 +time_based=3D0 +rw=3Dwrite +log_avg_msec=3D1000 +write_bw_log=3Dnoisywriter +write_lat_log=3Dnoisywriter + +[noisywriter] diff --git a/tools/testing/dontcache-bench/fio-jobs/rand-read.fio b/tools/t= esting/dontcache-bench/fio-jobs/rand-read.fio new file mode 100644 index 000000000000..e281fa82b86a --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/rand-read.fio @@ -0,0 +1,13 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D4k +numjobs=3D1 +iodepth=3D16 +time_based=3D0 +rw=3Drandread +log_avg_msec=3D1000 +write_bw_log=3Drandread +write_lat_log=3Drandread + +[randread] diff --git a/tools/testing/dontcache-bench/fio-jobs/rand-write.fio b/tools/= testing/dontcache-bench/fio-jobs/rand-write.fio new file mode 100644 index 000000000000..cf53bc6f14b9 --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/rand-write.fio @@ -0,0 +1,13 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D4k +numjobs=3D1 +iodepth=3D16 +time_based=3D0 +rw=3Drandwrite +log_avg_msec=3D1000 +write_bw_log=3Drandwrite +write_lat_log=3Drandwrite + +[randwrite] diff --git a/tools/testing/dontcache-bench/fio-jobs/seq-read.fio b/tools/te= sting/dontcache-bench/fio-jobs/seq-read.fio new file mode 100644 index 000000000000..ef87921465a7 --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/seq-read.fio @@ -0,0 +1,13 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D1M +numjobs=3D1 +iodepth=3D16 +time_based=3D0 +rw=3Dread +log_avg_msec=3D1000 +write_bw_log=3Dseqread +write_lat_log=3Dseqread + +[seqread] diff --git 
a/tools/testing/dontcache-bench/fio-jobs/seq-write.fio b/tools/t= esting/dontcache-bench/fio-jobs/seq-write.fio new file mode 100644 index 000000000000..da3082f9b391 --- /dev/null +++ b/tools/testing/dontcache-bench/fio-jobs/seq-write.fio @@ -0,0 +1,13 @@ +[global] +ioengine=3Dio_uring +direct=3D0 +bs=3D1M +numjobs=3D1 +iodepth=3D16 +time_based=3D0 +rw=3Dwrite +log_avg_msec=3D1000 +write_bw_log=3Dseqwrite +write_lat_log=3Dseqwrite + +[seqwrite] diff --git a/tools/testing/dontcache-bench/scripts/parse-results.sh b/tools= /testing/dontcache-bench/scripts/parse-results.sh new file mode 100755 index 000000000000..0427d411db04 --- /dev/null +++ b/tools/testing/dontcache-bench/scripts/parse-results.sh @@ -0,0 +1,238 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Parse fio JSON output and generate comparison tables. +# +# Usage: ./parse-results.sh + +set -euo pipefail + +if [ $# -lt 1 ]; then + echo "Usage: $0 " + exit 1 +fi + +RESULTS_DIR=3D"$1" + +if ! command -v jq &>/dev/null; then + echo "ERROR: jq is required" + exit 1 +fi + +# Extract metrics from a single fio JSON result +extract_metrics() { + local json_file=3D$1 + local rw_type=3D$2 # read or write + + if [ ! -f "$json_file" ]; then + echo "N/A N/A N/A N/A N/A N/A" + return + fi + + jq -r --arg rw "$rw_type" ' + .jobs[0][$rw] as $d | + [ + (($d.bw // 0) / 1024 | . * 10 | round / 10), # MB/s + ($d.iops // 0), # IOPS + ((($d.clat_ns.mean // 0) / 1000) | . * 10 | round / 10), # avg lat us + (($d.clat_ns.percentile["50.000000"] // 0) / 1000), # p50 us + (($d.clat_ns.percentile["99.000000"] // 0) / 1000), # p99 us + (($d.clat_ns.percentile["99.900000"] // 0) / 1000) # p99.9 us + ] | @tsv + ' "$json_file" 2>/dev/null || echo "N/A N/A N/A N/A N/A N/A" +} + +# Extract server CPU from vmstat log (average sys%) +extract_cpu() { + local vmstat_log=3D$1 + if [ ! 
-f "$vmstat_log" ]; then + echo "N/A" + return + fi + # vmstat columns: us sy id wa st =E2=80=94 skip header lines + awk 'NR>2 {sum+=3D$14; n++} END {if(n>0) printf "%.1f", sum/n; else print= "N/A"}' \ + "$vmstat_log" 2>/dev/null || echo "N/A" +} + +# Extract peak dirty pages from meminfo log +extract_peak_dirty() { + local meminfo_log=3D$1 + if [ ! -f "$meminfo_log" ]; then + echo "N/A" + return + fi + grep "^Dirty:" "$meminfo_log" | awk '{print $2}' | sort -n | tail -1 || e= cho "N/A" +} + +# Extract peak cached from meminfo log +extract_peak_cached() { + local meminfo_log=3D$1 + if [ ! -f "$meminfo_log" ]; then + echo "N/A" + return + fi + grep "^Cached:" "$meminfo_log" | awk '{print $2}' | sort -n | tail -1 || = echo "N/A" +} + +print_separator() { + printf '%*s\n' 120 '' | tr ' ' '-' +} + +######################################################################## +# Deliverable 1: Single-client results +######################################################################## +echo "" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo " Deliverable 1: Single-Client fio Benchmarks" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo "" + +for workload in seq-write rand-write seq-read rand-read; do + case $workload in + seq-write|rand-write) rw_type=3D"write" ;; + seq-read|rand-read) rw_type=3D"read" ;; + esac + + echo "--- $workload ---" + printf "%-16s %10s %10s %10s %10s %10s %10s %10s %12s %12s\n" \ + "Mode" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" "Sys CPU%= " "PeakDirty(kB)" "PeakCache(kB)" + print_separator + + for mode in buffered dontcache direct; do + dir=3D"${RESULTS_DIR}/${workload}/${mode}" + json_file=3D$(find 
"$dir" -name '*.json' -not -name 'client*' 2>/dev/nul= l | head -1 || true) + if [ -z "$json_file" ]; then + printf "%-16s %10s\n" "$mode" "(no data)" + continue + fi + + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "$rw_type")" + cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + cached=3D$(extract_peak_cached "${dir}/meminfo.log") + + printf "%-16s %10s %10s %10s %10s %10s %10s %10s %12s %12s\n" \ + "$mode" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" \ + "$cpu" "${dirty:-N/A}" "${cached:-N/A}" + done + echo "" +done + +######################################################################## +# Deliverable 2: Multi-client results +######################################################################## +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo " Deliverable 2: Noisy-Neighbor Benchmarks" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo "" + +# Scenario A: Multiple writers +echo "--- Scenario A: Multiple Writers ---" +for mode in buffered dontcache direct; do + dir=3D"${RESULTS_DIR}/multi-write/${mode}" + if [ ! 
-d "$dir" ]; then + continue + fi + + echo " Mode: $mode" + printf " %-10s %10s %10s %10s %10s %10s %10s\n" \ + "Client" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" + + total_bw=3D0 + count=3D0 + for json_file in "${dir}"/client*.json; do + [ -f "$json_file" ] || continue + client=3D$(basename "$json_file" .json) + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "write")" + printf " %-10s %10s %10s %10s %10s %10s %10s\n" \ + "$client" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + total_bw=3D$(echo "$total_bw + ${mbps:-0}" | bc 2>/dev/null || echo "$to= tal_bw") + count=3D$(( count + 1 )) + done + + cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + printf " Aggregate BW: %s MB/s | Sys CPU: %s%% | Peak Dirty: %s kB\n" \ + "$total_bw" "$cpu" "${dirty:-N/A}" + echo "" +done + +# Scenario C: Noisy neighbor +echo "--- Scenario C: Noisy Writer + Latency-Sensitive Readers ---" +for mode in buffered dontcache direct; do + dir=3D"${RESULTS_DIR}/noisy-neighbor/${mode}" + if [ ! 
-d "$dir" ]; then + continue + fi + + echo " Mode: $mode" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Job" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" + + # Writer + if [ -f "${dir}/noisy_writer.json" ]; then + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "${dir}/noisy_writer.json" "write")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Bulk writer" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + fi + + # Readers + for json_file in "${dir}"/reader*.json; do + [ -f "$json_file" ] || continue + reader=3D$(basename "$json_file" .json) + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "read")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "$reader" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + done + + cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + printf " Sys CPU: %s%% | Peak Dirty: %s kB\n" "$cpu" "${dirty:-N/A}" + echo "" +done + +# Scenario D: Mixed-mode noisy neighbor +echo "--- Scenario D: Mixed-Mode Noisy Writer + Readers ---" +for dir in "${RESULTS_DIR}"/noisy-neighbor-mixed/*/; do + [ -d "$dir" ] || continue + label=3D$(basename "$dir") + + echo " Mode: $label" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Job" "MB/s" "IOPS" "Avg(us)" "p50(us)" "p99(us)" "p99.9(us)" + + # Writer + if [ -f "${dir}/noisy_writer.json" ]; then + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "${dir}/noisy_writer.json" "write")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "Bulk writer" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + fi + + # Readers + for json_file in "${dir}"/reader*.json; do + [ -f "$json_file" ] || continue + reader=3D$(basename "$json_file" .json) + read -r mbps iops avg_lat p50 p99 p999 <<< \ + "$(extract_metrics "$json_file" "read")" + printf " %-14s %10s %10s %10s %10s %10s %10s\n" \ + "$reader" "$mbps" "$iops" "$avg_lat" "$p50" "$p99" "$p999" + done + + 
cpu=3D$(extract_cpu "${dir}/vmstat.log") + dirty=3D$(extract_peak_dirty "${dir}/meminfo.log") + printf " Sys CPU: %s%% | Peak Dirty: %s kB\n" "$cpu" "${dirty:-N/A}" + echo "" +done + +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +echo " System Info" +echo "=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D" +if [ -f "${RESULTS_DIR}/sysinfo.txt" ]; then + head -6 "${RESULTS_DIR}/sysinfo.txt" +fi +echo "" diff --git a/tools/testing/dontcache-bench/scripts/run-benchmarks.sh b/tool= s/testing/dontcache-bench/scripts/run-benchmarks.sh new file mode 100755 index 000000000000..11bf400ef092 --- /dev/null +++ b/tools/testing/dontcache-bench/scripts/run-benchmarks.sh @@ -0,0 +1,562 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Local filesystem I/O mode benchmark suite. +# +# Runs the same test matrix as run-benchmarks.sh but on a local filesystem +# using fio's io_uring engine with the RWF_DONTCACHE flag instead of NFSD's +# debugfs mode knobs. 
+# +# Usage: ./run-benchmarks.sh [options] +# -t Test directory (must be on a filesystem supporting FOP_DON= TCACHE) +# -s File size (default: 2x RAM) +# -f Path to fio binary (default: fio in PATH) +# -o Output directory for results (default: ./results/) +# -d Dry run (print commands without executing) + +set -euo pipefail + +# Defaults +TEST_DIR=3D"" +SIZE=3D"" +FIO_BIN=3D"fio" +RESULTS_DIR=3D"" +DRY_RUN=3D0 +MODES=3D"0 1 2" +PERF_LOCK=3D0 +SCRIPT_DIR=3D"$(cd "$(dirname "$0")" && pwd)" +FIO_JOBS_DIR=3D"${SCRIPT_DIR}/../fio-jobs" + +usage() { + echo "Usage: $0 -t [-s ] [-f ] [-o ] [-D] [-p] [-d]" + echo "" + echo " -t Test directory (required, must support RWF_DONTCACHE)" + echo " -s File size (default: 2x RAM)" + echo " -f Path to fio binary (default: fio)" + echo " -o Output directory (default: ./results/)" + echo " -D Dontcache only (skip buffered and direct tests)" + echo " -p Profile kernel lock contention with perf lock" + echo " -d Dry run" + exit 1 +} + +while getopts "t:s:f:o:Dpdh" opt; do + case $opt in + t) TEST_DIR=3D"$OPTARG" ;; + s) SIZE=3D"$OPTARG" ;; + f) FIO_BIN=3D"$OPTARG" ;; + o) RESULTS_DIR=3D"$OPTARG" ;; + D) MODES=3D"1" ;; + p) PERF_LOCK=3D1 ;; + d) DRY_RUN=3D1 ;; + h) usage ;; + *) usage ;; + esac +done + +if [ -z "$TEST_DIR" ]; then + echo "ERROR: -t is required" + usage +fi + +# Auto-size to 2x RAM if not specified +if [ -z "$SIZE" ]; then + mem_kb=3D$(awk '/MemTotal/ {print $2}' /proc/meminfo) + SIZE=3D"$(( mem_kb * 2 / 1024 ))M" +fi + +if [ -z "$RESULTS_DIR" ]; then + RESULTS_DIR=3D"./results/local-$(date +%Y%m%d-%H%M%S)" +fi + +mkdir -p "$RESULTS_DIR" + +log() { + echo "[$(date '+%H:%M:%S')] $*" +} + +run_cmd() { + if [ "$DRY_RUN" -eq 1 ]; then + echo " [DRY RUN] $*" + else + "$@" + fi +} + +# I/O mode definitions: +# buffered: direct=3D0, uncached=3D0 +# dontcache: direct=3D0, uncached=3D1 +# direct: direct=3D1, uncached=3D0 +# +# Mode name from numeric value +mode_name() { + case $1 in + 0) echo "buffered" ;; + 
1) echo "dontcache" ;; + 2) echo "direct" ;; + esac +} + +# Return fio command-line flags for a given mode. +# "direct" is a standard fio option and works on the command line. +# "uncached" is an io_uring engine option that must be in the job file, +# so we inject it via make_job_file() below. +mode_fio_args() { + case $1 in + 0) echo "--direct=3D0" ;; # buffered + 1) echo "--direct=3D0" ;; # dontcache + 2) echo "--direct=3D1" ;; # direct + esac +} + +# Return the uncached=3D value for a given mode. +mode_uncached() { + case $1 in + 0) echo "0" ;; + 1) echo "1" ;; + 2) echo "0" ;; + esac +} + +# Create a temporary job file with uncached=3DN injected into [global]. +# For uncached=3D0 (buffered/direct), return the original file unchanged. +make_job_file() { + local job_file=3D$1 + local uncached=3D$2 + + if [ "$uncached" -eq 0 ]; then + echo "$job_file" + return + fi + + local tmp + tmp=3D$(mktemp) + sed "/^\[global\]/a uncached=3D${uncached}" "$job_file" > "$tmp" + echo "$tmp" +} + +drop_caches() { + run_cmd bash -c "sync && echo 3 > /proc/sys/vm/drop_caches" +} + +# perf lock profiling =E2=80=94 uses BPF-based live contention tracing +PERF_LOCK_PID=3D"" + +start_perf_lock() { + local outdir=3D$1 + + if [ "$PERF_LOCK" -ne 1 ]; then + return + fi + + log "Starting perf lock contention tracing" + perf lock contention -a -b --max-stack 8 \ + > "${outdir}/perf-lock-contention.txt" 2>&1 & + PERF_LOCK_PID=3D$! +} + +stop_perf_lock() { + local outdir=3D$1 + + if [ -z "$PERF_LOCK_PID" ]; then + return + fi + + log "Stopping perf lock contention tracing" + kill -TERM "$PERF_LOCK_PID" 2>/dev/null || true + wait "$PERF_LOCK_PID" 2>/dev/null || true + PERF_LOCK_PID=3D"" +} + +# Background monitors +VMSTAT_PID=3D"" +IOSTAT_PID=3D"" +MEMINFO_PID=3D"" + +start_monitors() { + local outdir=3D$1 + log "Starting monitors in $outdir" + run_cmd vmstat 1 > "${outdir}/vmstat.log" 2>&1 & + VMSTAT_PID=3D$! + run_cmd iostat -x 1 > "${outdir}/iostat.log" 2>&1 & + IOSTAT_PID=3D$! 
+    (while true; do
+        echo "=== $(date '+%s') ==="
+        cat /proc/meminfo
+        sleep 1
+    done) > "${outdir}/meminfo.log" 2>&1 &
+    MEMINFO_PID=$!
+}
+
+stop_monitors() {
+    log "Stopping monitors"
+    kill "$VMSTAT_PID" "$IOSTAT_PID" "$MEMINFO_PID" 2>/dev/null || true
+    wait "$VMSTAT_PID" "$IOSTAT_PID" "$MEMINFO_PID" 2>/dev/null || true
+}
+
+cleanup_test_files() {
+    local filepath="${TEST_DIR}/$1"
+    log "Cleaning up $filepath"
+    run_cmd rm -f "$filepath"
+}
+
+# Run a single fio benchmark
+run_fio() {
+    local job_file=$1
+    local outdir=$2
+    local filename=$3
+    local fio_size=${4:-$SIZE}
+    local keep=${5:-}
+    local extra_args=${6:-}
+    local uncached=${7:-0}
+
+    # Inject uncached=N into the job file if needed
+    local actual_job
+    actual_job=$(make_job_file "$job_file" "$uncached")
+
+    local job_name
+    job_name=$(basename "$job_file" .fio)
+
+    log "Running fio job: $job_name -> $outdir (file=${TEST_DIR}/$filename size=$fio_size)"
+    mkdir -p "$outdir"
+
+    drop_caches
+    start_monitors "$outdir"
+    # Skip perf lock profiling for precreate/setup runs
+    [ "$keep" != "keep" ] && start_perf_lock "$outdir"
+
+    # shellcheck disable=SC2086
+    run_cmd "$FIO_BIN" "$actual_job" \
+        --output-format=json \
+        --output="${outdir}/${job_name}.json" \
+        --filename="${TEST_DIR}/$filename" \
+        --size="$fio_size" \
+        $extra_args
+
+    [ "$keep" != "keep" ] && stop_perf_lock "$outdir"
+    stop_monitors
+    log "Finished: $job_name"
+
+    # Clean up temp job file if one was created
+    [ "$actual_job" != "$job_file" ] && rm -f "$actual_job"
+
+    if [ "$keep" != "keep" ]; then
+        cleanup_test_files "$filename"
+    fi
+}
+
+########################################################################
+# Preflight
+########################################################################
+preflight() {
+    log "=== Preflight checks ==="
+
+    if ! command -v "$FIO_BIN" &>/dev/null; then
+        echo "ERROR: fio not found at $FIO_BIN"
+        exit 1
+    fi
+
+    if [ ! -d "$TEST_DIR" ]; then
+        echo "ERROR: Test directory $TEST_DIR does not exist"
+        exit 1
+    fi
+
+    # Quick check that RWF_DONTCACHE works on this filesystem
+    local testfile="${TEST_DIR}/.dontcache_test"
+    if ! "$FIO_BIN" --name=test --ioengine=io_uring --rw=write \
+        --bs=4k --size=4k --direct=0 --uncached=1 \
+        --filename="$testfile" 2>/dev/null; then
+        echo "WARNING: RWF_DONTCACHE may not be supported on $TEST_DIR"
+        echo " (filesystem must support FOP_DONTCACHE)"
+    fi
+    rm -f "$testfile"
+
+    log "Test directory: $TEST_DIR"
+    log "File size: $SIZE"
+    log "fio binary: $FIO_BIN"
+    log "Results: $RESULTS_DIR"
+
+    # Record system info
+    {
+        echo "Timestamp: $(date +%Y%m%d-%H%M%S)"
+        echo "Kernel: $(uname -r)"
+        echo "Hostname: $(hostname)"
+        echo "Filesystem: $(df -T "$TEST_DIR" | tail -1 | awk '{print $2}')"
+        echo "File size: $SIZE"
+        echo "Test dir: $TEST_DIR"
+    } > "${RESULTS_DIR}/sysinfo.txt"
+}
+
+########################################################################
+# Deliverable 1: Single-client benchmarks
+########################################################################
+run_deliverable1() {
+    log "=========================================="
+    log "Deliverable 1: Single-client benchmarks"
+    log "=========================================="
+
+    # Sequential write
+    for mode in $MODES; do
+        local mname
+        mname=$(mode_name $mode)
+        local fio_args
+        fio_args=$(mode_fio_args $mode)
+
+        drop_caches
+        run_fio "${FIO_JOBS_DIR}/seq-write.fio" \
+            "${RESULTS_DIR}/seq-write/${mname}" \
+            "seq-write_testfile" "$SIZE" "" "$fio_args" \
+            "$(mode_uncached $mode)"
+    done
+
+    # Random write
+    for mode in $MODES; do
+        local mname
+        mname=$(mode_name $mode)
+        local fio_args
+        fio_args=$(mode_fio_args $mode)
+
+        drop_caches
+        run_fio "${FIO_JOBS_DIR}/rand-write.fio" \
+            "${RESULTS_DIR}/rand-write/${mname}" \
+            "rand-write_testfile" "$SIZE" "" "$fio_args" \
+            "$(mode_uncached $mode)"
+    done
+
+    # Sequential read - pre-create file, then read with each mode
+    log "Pre-creating sequential read test file"
+    run_fio "${FIO_JOBS_DIR}/seq-write.fio" \
+        "${RESULTS_DIR}/seq-read/precreate" \
+        "seq-read_testfile" "$SIZE" "keep"
+
+    for rmode in $MODES; do
+        local mname
+        mname=$(mode_name $rmode)
+        local fio_args
+        fio_args=$(mode_fio_args $rmode)
+        local keep="keep"
+        [ "$rmode" -eq 2 ] && keep=""
+
+        drop_caches
+        run_fio "${FIO_JOBS_DIR}/seq-read.fio" \
+            "${RESULTS_DIR}/seq-read/${mname}" \
+            "seq-read_testfile" "$SIZE" "$keep" "$fio_args" \
+            "$(mode_uncached $rmode)"
+    done
+
+    # Random read - pre-create file, then read with each mode
+    log "Pre-creating random read test file"
+    run_fio "${FIO_JOBS_DIR}/seq-write.fio" \
+        "${RESULTS_DIR}/rand-read/precreate" \
+        "rand-read_testfile" "$SIZE" "keep"
+
+    for rmode in $MODES; do
+        local mname
+        mname=$(mode_name $rmode)
+        local fio_args
+        fio_args=$(mode_fio_args $rmode)
+        local keep="keep"
+        [ "$rmode" -eq 2 ] && keep=""
+
+        drop_caches
+        run_fio "${FIO_JOBS_DIR}/rand-read.fio" \
+            "${RESULTS_DIR}/rand-read/${mname}" \
+            "rand-read_testfile" "$SIZE" "$keep" "$fio_args" \
+            "$(mode_uncached $rmode)"
+    done
+}
+
+########################################################################
+# Deliverable 2: Multi-client tests
+########################################################################
+run_deliverable2() {
+    log "=========================================="
+    log "Deliverable 2: Noisy-neighbor benchmarks"
+    log "=========================================="
+
+    local num_clients=4
+    local client_size
+    local mem_kb
+    mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
+    client_size="$(( mem_kb / 1024 / num_clients ))M"
+
+    # Scenario A: Multiple writers
+    for mode in $MODES; do
+        local mname
+        mname=$(mode_name $mode)
+        local fio_args
+        fio_args=$(mode_fio_args $mode)
+        local uncached
+        uncached=$(mode_uncached $mode)
+        local actual_job
+        actual_job=$(make_job_file "${FIO_JOBS_DIR}/multi-write.fio" "$uncached")
+        local outdir="${RESULTS_DIR}/multi-write/${mname}"
+        mkdir -p "$outdir"
+
+        drop_caches
+        start_monitors "$outdir"
+        start_perf_lock "$outdir"
+
+        local pids=()
+        for i in $(seq 1 $num_clients); do
+            # shellcheck disable=SC2086
+            run_cmd "$FIO_BIN" "$actual_job" \
+                --output-format=json \
+                --output="${outdir}/client${i}.json" \
+                --filename="${TEST_DIR}/client${i}_testfile" \
+                --size="$client_size" \
+                $fio_args &
+            pids+=($!)
+        done
+
+        local rc=0
+        for pid in "${pids[@]}"; do
+            wait "$pid" || rc=$?
+        done
+
+        stop_perf_lock "$outdir"
+        stop_monitors
+        [ $rc -ne 0 ] && log "WARNING: some fio jobs exited non-zero"
+
+        [ "$actual_job" != "${FIO_JOBS_DIR}/multi-write.fio" ] && rm -f "$actual_job"
+        for i in $(seq 1 $num_clients); do
+            cleanup_test_files "client${i}_testfile"
+        done
+    done
+
+    # Scenario C: Noisy writer + latency-sensitive readers
+    for mode in $MODES; do
+        local mname
+        mname=$(mode_name $mode)
+        local fio_args
+        fio_args=$(mode_fio_args $mode)
+        local uncached
+        uncached=$(mode_uncached $mode)
+        local writer_job
+        writer_job=$(make_job_file "${FIO_JOBS_DIR}/noisy-writer.fio" "$uncached")
+        local reader_job
+        reader_job=$(make_job_file "${FIO_JOBS_DIR}/lat-reader.fio" "$uncached")
+        local outdir="${RESULTS_DIR}/noisy-neighbor/${mname}"
+        mkdir -p "$outdir"
+
+        # Pre-create read files
+        for i in $(seq 1 $(( num_clients - 1 ))); do
+            log "Pre-creating read file for reader $i"
+            run_fio "${FIO_JOBS_DIR}/multi-write.fio" \
+                "${outdir}/precreate_reader${i}" \
+                "reader${i}_readfile" \
+                "512M" "keep"
+        done
+        drop_caches
+        start_monitors "$outdir"
+        start_perf_lock "$outdir"
+
+        # Noisy writer
+        # shellcheck disable=SC2086
+        run_cmd "$FIO_BIN" "$writer_job" \
+            --output-format=json \
+            --output="${outdir}/noisy_writer.json" \
+            --filename="${TEST_DIR}/bulk_testfile" \
+            --size="$SIZE" \
+            $fio_args &
+        local writer_pid=$!
+
+        # Latency-sensitive readers
+        local reader_pids=()
+        for i in $(seq 1 $(( num_clients - 1 ))); do
+            # shellcheck disable=SC2086
+            run_cmd "$FIO_BIN" "$reader_job" \
+                --output-format=json \
+                --output="${outdir}/reader${i}.json" \
+                --filename="${TEST_DIR}/reader${i}_readfile" \
+                --size="512M" \
+                $fio_args &
+            reader_pids+=($!)
+        done
+
+        local rc=0
+        wait "$writer_pid" || rc=$?
+        for pid in "${reader_pids[@]}"; do
+            wait "$pid" || rc=$?
+        done
+
+        stop_perf_lock "$outdir"
+        stop_monitors
+        [ $rc -ne 0 ] && log "WARNING: some fio jobs exited non-zero"
+
+        [ "$writer_job" != "${FIO_JOBS_DIR}/noisy-writer.fio" ] && rm -f "$writer_job"
+        [ "$reader_job" != "${FIO_JOBS_DIR}/lat-reader.fio" ] && rm -f "$reader_job"
+        cleanup_test_files "bulk_testfile"
+        for i in $(seq 1 $(( num_clients - 1 ))); do
+            cleanup_test_files "reader${i}_readfile"
+        done
+    done
+
+    # Scenario D: Mixed-mode noisy neighbor
+    # dontcache writes + buffered reads
+    local outdir="${RESULTS_DIR}/noisy-neighbor-mixed/dontcache-w_buffered-r"
+    mkdir -p "$outdir"
+    local writer_job
+    writer_job=$(make_job_file "${FIO_JOBS_DIR}/noisy-writer.fio" 1)
+
+    for i in $(seq 1 $(( num_clients - 1 ))); do
+        log "Pre-creating read file for reader $i"
+        run_fio "${FIO_JOBS_DIR}/multi-write.fio" \
+            "${outdir}/precreate_reader${i}" \
+            "reader${i}_readfile" \
+            "512M" "keep"
+    done
+    drop_caches
+    start_monitors "$outdir"
+    start_perf_lock "$outdir"
+
+    # Writer with dontcache
+    run_cmd "$FIO_BIN" "$writer_job" \
+        --output-format=json \
+        --output="${outdir}/noisy_writer.json" \
+        --filename="${TEST_DIR}/bulk_testfile" \
+        --size="$SIZE" \
+        --direct=0 &
+    local writer_pid=$!
+
+    # Readers with buffered (no uncached flag)
+    local reader_pids=()
+    for i in $(seq 1 $(( num_clients - 1 ))); do
+        run_cmd "$FIO_BIN" "${FIO_JOBS_DIR}/lat-reader.fio" \
+            --output-format=json \
+            --output="${outdir}/reader${i}.json" \
+            --filename="${TEST_DIR}/reader${i}_readfile" \
+            --size="512M" \
+            --direct=0 &
+        reader_pids+=($!)
+    done
+
+    local rc=0
+    wait "$writer_pid" || rc=$?
+    for pid in "${reader_pids[@]}"; do
+        wait "$pid" || rc=$?
+    done
+
+    stop_perf_lock "$outdir"
+    stop_monitors
+    [ $rc -ne 0 ] && log "WARNING: some fio jobs exited non-zero"
+
+    [ "$writer_job" != "${FIO_JOBS_DIR}/noisy-writer.fio" ] && rm -f "$writer_job"
+    cleanup_test_files "bulk_testfile"
+    for i in $(seq 1 $(( num_clients - 1 ))); do
+        cleanup_test_files "reader${i}_readfile"
+    done
+}
+
+########################################################################
+# Main
+########################################################################
+preflight
+run_deliverable1
+run_deliverable2
+
+log "=========================================="
+log "All benchmarks complete."
+log "Results in: $RESULTS_DIR"
+log "Parse with: scripts/parse-results.sh $RESULTS_DIR"
+log "=========================================="
-- 
2.53.0