tools/testing/selftests/cgroup/test_cpu.c | 42 ++++++++++++++++++------------- 1 file changed, 24 insertions(+), 18 deletions(-)
cpu.max selftests (both the normal one and the nested one) test the working of throttling by setting up cpu.max, running a cpu hog process for a specified duration, and comparing usage_usec as reported by cpu.stat with the duration of the cpu hog: they should be far enough. Currently, this is done by using values_close, which has two problems: 1. Semantic: values_close is used with an error percentage of 95%, which one will not expect on seeing "values close". The intent it's actually going for is "values far". 2. Accuracy: the tests can pass even if usage_usec is upto around double the expected amount. That's too high of a margin for usage_usec. Overall, this patchset improves the readability and accuracy of the cpu.max tests. Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com> --- Shashank Balaji (2): selftests/cgroup: rename `expected` to `duration` in cpu.max tests selftests/cgroup: better bound in cpu.max tests tools/testing/selftests/cgroup/test_cpu.c | 42 ++++++++++++++++++------------- 1 file changed, 24 insertions(+), 18 deletions(-) --- base-commit: 66701750d5565c574af42bef0b789ce0203e3071 change-id: 20250227-kselftest-cgroup-fix-cpu-max-56619928e99b Best regards, -- Shashank Balaji <shashank.mahadasyam@sony.com>
Hello Shashank. On Tue, Jul 01, 2025 at 11:13:54PM +0900, Shashank Balaji <shashank.mahadasyam@sony.com> wrote: > cpu.max selftests (both the normal one and the nested one) test the > working of throttling by setting up cpu.max, running a cpu hog process > for a specified duration, and comparing usage_usec as reported by > cpu.stat with the duration of the cpu hog: they should be far enough. > > Currently, this is done by using values_close, which has two problems: > > 1. Semantic: values_close is used with an error percentage of 95%, which > one will not expect on seeing "values close". The intent it's > actually going for is "values far". > > 2. Accuracy: the tests can pass even if usage_usec is upto around double > the expected amount. That's too high of a margin for usage_usec. > > Overall, this patchset improves the readability and accuracy of the > cpu.max tests. > > Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com> I think you're getting at an actual bug in the test definition. I think that the test_cpucg_max should either run hog_cpus_timed with CPU_HOG_CLOCK_PROCESS instead of CPU_HOG_CLOCK_WALL to make sense or the expected_usage_usec should be defined with the configured quota in mind (i.e. 1/100). (The latter seems to make the test more natural.) With such defined metrics, the asserted expression could be values_close(usage_usec, expected_usage_usec, 10) based on your numbers, error is around 20% so our helper's argument is roughly half of that. (I'd be fine even with err=20 to prevent some false positives.) I think those changes could even be in one patch but I leave that up to you. My comment to your 2nd patch is that I'd like to stick to relative errors and keep positive values_close() predicate that's used in other selftests too. (But those 95% in the current code are clumsy given two different qualities are compared.) Thanks, Michal
Hi Michal, Thanks for the reply! On Wed, Jul 02, 2025 at 02:34:29PM +0200, Michal Koutný wrote: > Hello Shashank. > > On Tue, Jul 01, 2025 at 11:13:54PM +0900, Shashank Balaji <shashank.mahadasyam@sony.com> wrote: > > cpu.max selftests (both the normal one and the nested one) test the > > working of throttling by setting up cpu.max, running a cpu hog process > > for a specified duration, and comparing usage_usec as reported by > > cpu.stat with the duration of the cpu hog: they should be far enough. > > > > Currently, this is done by using values_close, which has two problems: > > > > 1. Semantic: values_close is used with an error percentage of 95%, which > > one will not expect on seeing "values close". The intent it's > > actually going for is "values far". > > > > 2. Accuracy: the tests can pass even if usage_usec is upto around double > > the expected amount. That's too high of a margin for usage_usec. > > > > Overall, this patchset improves the readability and accuracy of the > > cpu.max tests. > > > > Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com> > > I think you're getting at an actual bug in the test definition. > > I think that the test_cpucg_max should either run hog_cpus_timed with > CPU_HOG_CLOCK_PROCESS instead of CPU_HOG_CLOCK_WALL to make sense or the > expected_usage_usec should be defined with the configured quota in mind > (i.e. 1/100). (The latter seems to make the test more natural.) Going with the more natural way of sticking to CPU_HOG_CLOCK_WALL, the second patch does calculate expected_usage_usec based on the configured quota, as the code comment explains. So I'm guessesing we're on the same page about this? > With such defined metrics, the asserted expression could be > values_close(usage_usec, expected_usage_usec, 10) > based on your numbers, error is around 20% so our helper's argument is > roughly half of that. (I'd be fine even with err=20 to prevent some > false positives.) > > I think those changes could even be in one patch but I leave that up to > you. My comment to your 2nd patch is that I'd like to stick to relative > errors and keep positive values_close() predicate that's used in other > selftests too. (But those 95% in the current code are clumsy given two > different qualities are compared.) Do you mean something like, if (values_close(usage_usec, expected_usage_usec, 10)) goto cleanup; using the positive values_close() predicate. If so, I'm not sure I understand because if usage_usec and expected_usage_usec _are_ close, then we want the test to pass! We should be using the negative predicate. And sure, I'll send v2 as a single patch. Thanks Shashank
On Thu, Jul 03, 2025 at 10:41:17AM +0900, Shashank Balaji <shashank.mahadasyam@sony.com> wrote: > Going with the more natural way of sticking to CPU_HOG_CLOCK_WALL, the > second patch does calculate expected_usage_usec based on the configured > quota, as the code comment explains. So I'm guessesing we're on the same page > about this? Yes, the expected_usage_usec in the 2nd patch is correct. (It'd be nicer if the value calculated from the configured cpu.max and not typed out as a literal that may diverge should be the cpu.max changed in the test.) > Do you mean something like, > > if (values_close(usage_usec, expected_usage_usec, 10)) > goto cleanup; > > using the positive values_close() predicate. If so, I'm not sure I > understand because if usage_usec and expected_usage_usec _are_ close, > then we want the test to pass! We should be using the negative > predicate. I meant to use it the same way like test_memcontrol.c does, i.e. values_close() <=> pass. So codewise it boils down to (a negation, I see why the confusion): if (!values_close(usage_usec, expected_usage_usec, 10)) goto cleanup; ret = KSFT_PASS; Michal
© 2016 - 2025 Red Hat, Inc.