Add a new system state document to the admin-guide. This document is
intended to be used as a guide on how to gather higher level information
about a system and its run-time activity.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
---
Changes since v1:
-- Addressed review comments
Documentation/admin-guide/index.rst | 1 +
Documentation/admin-guide/system-state.rst | 350 +++++++++++++++++++++
2 files changed, 351 insertions(+)
create mode 100644 Documentation/admin-guide/system-state.rst
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index f475554382e2..541372672c55 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -66,6 +66,7 @@ subsystems expectations will be found here.
:maxdepth: 1
workload-tracing
+ system-state
The rest of this manual consists of various unordered guides on how to
configure specific aspects of kernel behavior to your liking.
diff --git a/Documentation/admin-guide/system-state.rst b/Documentation/admin-guide/system-state.rst
new file mode 100644
index 000000000000..2a6fdf85c35c
--- /dev/null
+++ b/Documentation/admin-guide/system-state.rst
@@ -0,0 +1,350 @@
+.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
+
+===========================================================
+Discovering system calls and features supported on a system
+===========================================================
+
+:Author: Shuah Khan <skhan@linuxfoundation.org>
+:maintained-by: Shuah Khan <skhan@linuxfoundation.org>
+
+Key Points
+==========
+
+ * System state includes system calls, features, static and dynamic
+ modules enabled in the kernel configuration.
+ * Supported system calls and kernel features are architecture dependent.
+ * The auditd, checksyscalls.sh, and get_feat.pl tools can be used to
+ discover static system state.
+ * Understanding Linux kernel hardening configuration options and making
+ sure they are enabled will make a system more secure.
+ * Employing run-time tracing can shed light on the dynamic system state.
+ * Workloads could change the system state by loading and unloading dynamic
+ modules and tuning system parameters.
+
+System State Visualization
+==========================
+
+The kernel system state can be viewed as a combination of static and
+dynamic features and modules. Let's first define what static and dynamic
+system states are and then explore how we can visualize the static and
+dynamic system parts of the kernel.
+
+The static system view comprises the system calls, features, and static and
+dynamic modules enabled in the kernel configuration. Supported system calls
+and kernel features are architecture dependent. System call numbering
+differs between architectures. We can get the supported system call
+information using the auditd utilities.
+
+ausyscall --dump prints out the supported system calls on a system and allows
+mapping syscall names and numbers. You can install the auditd package on
+Debian-based systems::
+
+ sudo apt-get install auditd
+
+scripts/checksyscalls.sh can be used to check if the current architecture is
+missing any system calls compared to i386.
+
+scripts/get_feat.pl can be used to list the kernel feature support matrix
+for an architecture.
+
+The dynamic system view comprises the system calls and ioctls invoked, and
+the subsystems used, at run time. A workload could load and unload modules
+and also change the dynamic system configuration to suit its needs by tuning
+system parameters.
+
+What is the methodology?
+========================
+
+The first step is gathering the default system state, such as the dynamic
+and static modules loaded on the system. The lsmod command prints out the
+dynamically loaded modules on a system. Statically configured modules can
+be found in the kernel configuration file.
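+
+As a rough sketch of this first step, the commands below count the loaded
+modules and show the first few built-in ones. The modules.builtin path is an
+assumption: the file is generated by the kernel build and is normally
+installed under /lib/modules, but your distribution may place it elsewhere::
+
+ lsmod | tail -n +2 | wc -l
+ head /lib/modules/$(uname -r)/modules.builtin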
+
+The next step is discovering system activity during run-time. You can do so
+by enabling event tracing, as described below, and then running your
+favorite application. After a period of time, gather the event logs and
+kernel messages.
+
+Once you have the necessary information, you can extract the system call
+numbers from the event trace log and map them to the supported system calls.
+
+Finding supported system calls
+==============================
+
+As mentioned earlier, ausyscall prints out supported system calls
+on a system and allows mapping syscall names and numbers::
+
+ ausyscall --dump
+
+You can look for specific system calls as shown below::
+
+ ausyscall open
+ open 2
+ mq_open 240
+ openat 257
+ perf_event_open 298
+ open_by_handle_at 304
+ open_tree 428
+ fsopen 430
+ pidfd_open 434
+ openat2 437
+
+ ausyscall time
+
+ getitimer 36
+ setitimer 38
+ gettimeofday 96
+ times 100
+ rt_sigtimedwait 128
+ utime 132
+ adjtimex 159
+ settimeofday 164
+ time 201
+ semtimedop 220
+ timer_create 222
+ timer_settime 223
+ timer_gettime 224
+ timer_getoverrun 225
+ timer_delete 226
+ clock_settime 227
+ clock_gettime 228
+ utimes 235
+ mq_timedsend 242
+ mq_timedreceive 243
+ futimesat 261
+ utimensat 280
+ timerfd_create 283
+ timerfd_settime 286
+ timerfd_gettime 287
+ clock_adjtime 305
+
+Finding unsupported system calls
+================================
+
+As mentioned earlier, scripts/checksyscalls.sh checks for system calls
+missing on the current architecture compared to i386. Example run::
+
+ checksyscalls.sh gcc
+ warning: #warning syscall mmap2 not implemented [-Wcpp]
+ warning: #warning syscall truncate64 not implemented [-Wcpp]
+ warning: #warning syscall ftruncate64 not implemented [-Wcpp]
+ warning: #warning syscall fcntl64 not implemented [-Wcpp]
+ warning: #warning syscall sendfile64 not implemented [-Wcpp]
+ warning: #warning syscall statfs64 not implemented [-Wcpp]
+ warning: #warning syscall fstatfs64 not implemented [-Wcpp]
+ warning: #warning syscall fadvise64_64 not implemented [-Wcpp]
+
+Let's check this against ausyscall now::
+
+ ausyscall map
+ mmap 9
+ munmap 11
+ mremap 25
+ remap_file_pages 216
+
+ ausyscall trunc
+ truncate 76
+ ftruncate 77
+
+As you can see, ausyscall shows mmap2, truncate64, and ftruncate64 aren't
+implemented on this system. This matches what checksyscalls.sh shows.
+
+Finding supported features
+==========================
+
+scripts/get_feat.pl can be used to list the kernel feature support matrix
+for an architecture::
+
+ get_feat.pl list
+ get_feat.pl list --arch=arm64
+
+This script parses Documentation/features to find the support status
+information. It can be used to validate the contents of the files under
+Documentation/features or simply list them::
+
+ --arch Output features for a specific architecture, optionally filtering
+ for a single specific feature.
+ --feat or --feature Output features for a single specific feature.
+
+Here is how you can find if the stackprotector and thread-info-in-task
+features are supported::
+
+ scripts/get_feat.pl --arch=arm64 --feat=stackprotector list
+ #
+ # Kernel feature support matrix of the 'arm64' architecture:
+ #
+ debug/ stackprotector : ok | HAVE_STACKPROTECTOR #
+ arch supports compiler driven stack overflow protection
+
+ scripts/get_feat.pl --feat=thread-info-in-task list
+ #
+ # Kernel feature support matrix of the 'x86' architecture:
+ #
+ core/ thread-info-in-task : ok | THREAD_INFO_IN_TASK #
+ arch makes use of the core kernel facility to embed thread_info in
+ task_struct
+
+Finding kernel module status
+============================
+
+The lsmod command shows the kernel modules that are currently loaded. This
+program displays the contents of /proc/modules. Let's pick the uvcvideo
+module, which is found on most laptops::
+
+ lsmod | grep uvc
+ uvcvideo 126976 0
+ videobuf2_vmalloc 20480 1 uvcvideo
+ uvc 16384 1 uvcvideo
+ videobuf2_v4l2 36864 1 uvcvideo
+ videodev 315392 2 videobuf2_v4l2,uvcvideo
+ videobuf2_common 65536 4 videobuf2_vmalloc,videobuf2_v4l2,uvcvideo,videobuf2_memops
+ mc 77824 4 videodev,videobuf2_v4l2,uvcvideo,videobuf2_common
+
+You can see that lsmod shows uvcvideo and the modules it depends on and how
+many modules are using them. videobuf2_common is in use by 4 other modules.
+In other words, this is the reference count for this module and rmmod will
+refuse to unload it as long as the reference count is > 0.
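+
+The same reference count can be read directly from sysfs. This is a sketch
+that assumes the videobuf2_common module from the example above is loaded
+and that the kernel was built with module unloading support, which is what
+exposes the refcnt file::
+
+ cat /sys/module/videobuf2_common/refcnt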
+
+You can get the same information from /proc/modules::
+
+ grep uvc /proc/modules
+ uvcvideo 126976 0 - Live 0x0000000000000000
+ videobuf2_vmalloc 20480 1 uvcvideo, Live 0x0000000000000000
+ uvc 16384 1 uvcvideo, Live 0x0000000000000000
+ videobuf2_v4l2 36864 1 uvcvideo, Live 0x0000000000000000
+ videodev 315392 2 uvcvideo,videobuf2_v4l2, Live 0x0000000000000000
+ videobuf2_common 65536 4 uvcvideo,videobuf2_vmalloc,videobuf2_memops,videobuf2_v4l2, Live 0x0000000000000000
+ mc 77824 4 uvcvideo,videobuf2_v4l2,videodev,videobuf2_common, Live 0x0000000000000000
+
+The information is similar, with a few extra fields. The address is the
+base address of the module in kernel virtual memory space. When run as a
+normal user, the address is all zeros. The same lookup run as root shows the
+real addresses::
+
+ sudo grep uvc /proc/modules
+ uvcvideo 126976 0 - Live 0xffffffffc1c8b000
+ videobuf2_vmalloc 20480 1 uvcvideo, Live 0xffffffffc167f000
+ uvc 16384 1 uvcvideo, Live 0xffffffffc0ab0000
+ videobuf2_v4l2 36864 1 uvcvideo, Live 0xffffffffc0a28000
+ videodev 315392 2 uvcvideo,videobuf2_v4l2, Live 0xffffffffc16e9000
+ videobuf2_common 65536 4 uvcvideo,videobuf2_vmalloc,videobuf2_memops,videobuf2_v4l2, Live 0xffffffffc094d000
+ mc 77824 4 uvcvideo,videobuf2_v4l2,videodev,videobuf2_common, Live 0xffffffffc15eb000
+
+Let's check what modinfo shows that is important for us::
+
+ /sbin/modinfo uvcvideo
+ filename: /lib/modules/6.3.0-rc2/kernel/drivers/media/usb/uvc/uvcvideo.ko
+ license: GPL
+ description: USB Video Class driver
+ depends: videobuf2-v4l2,videodev,mc,uvc,videobuf2-common,videobuf2-vmalloc
+ retpoline: Y
+ intree: Y
+ name: uvcvideo
+ vermagic: 6.3.0-rc2 SMP preempt mod_unload modversions
+ sig_id: PKCS#7
+ signer: Build time autogenerated kernel key
+
+This tells us that this module is built in-tree and is signed with a
+build-time autogenerated key.
+
+Let's do one last sanity check on the system to see if the following two
+command outputs match::
+
+ ps ax | wc -l
+ ls -d /proc/* | grep '[0-9]' | wc -l
+
+Small differences are expected on a busy system, because processes start
+and exit between the two commands. If the counts differ by more than that,
+examine your system closely. Kernel rootkits install their own ps, find,
+etc. utilities to mask their activity. The outputs match on my system. Do
+they on yours?
+
+Is my system as secure as it could be?
+======================================
+
+The Linux kernel supports several hardening options to make the system
+more secure. The kconfig-hardened-check tool sanity checks the kernel
+configuration for security. You can clone the latest
+kconfig-hardened-check repository::
+
+ git clone https://github.com/a13xp0p0v/kconfig-hardened-check.git
+ cd kconfig-hardened-check
+ bin/kconfig-hardened-check --config <config file> --cmdline /proc/cmdline
+
+Here <config file> is the path to the kernel configuration file you want to
+check. This will generate a detailed report of the kernel security
+configuration and command line options that are enabled (OK) and the ones
+that aren't (FAIL), and a summary line at the end::
+
+ [+] Config check is finished: 'OK' - 100 / 'FAIL' - 100
+
+You will have to analyze the information to determine which options make
+sense to enable on your system.
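+
+If you are unsure where to find the configuration of the running kernel,
+the following sketch shows two common locations. Both are assumptions about
+your system: many distributions install the configuration under /boot, and
+/proc/config.gz exists only when the kernel was built with
+CONFIG_IKCONFIG_PROC. The running.config file name is just an example::
+
+ ls /boot/config-$(uname -r)
+ zcat /proc/config.gz > running.config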
+
+Understanding system run-time activity
+======================================
+
+Enabling event tracing gives insight into system run-time activity. This is
+a good way to identify which parts of the kernel are used at a higher level
+while the system is in use and/or while a specific workload/process is
+running.
+
+Event tracing depends on the CONFIG_EVENT_TRACING option being enabled. You
+can enable event tracing before starting a workload/process. Event tracing
+allows you to dynamically enable and disable tracing on supported/available
+events. You can find available events, tracers, and filter functions in the
+following files::
+
+ /sys/kernel/debug/tracing/available_events
+ /sys/kernel/debug/tracing/available_filter_functions
+ /sys/kernel/debug/tracing/available_tracers
+
+Now this is how you can enable tracing. The write has to be done as root,
+so pipe the value through sudo tee::
+
+ echo 1 | sudo tee /sys/kernel/debug/tracing/events/enable
+
+Once the workload/process stops or when you decide you have the information
+you need, you can disable event tracing::
+
+ echo 0 | sudo tee /sys/kernel/debug/tracing/events/enable
+
+You can find the tracing information in the file::
+
+ /sys/kernel/debug/tracing/trace
+
+Here is the header shown at the top of this file, before any events are
+recorded::
+
+ cat trace
+ # tracer: nop
+ #
+ # entries-in-buffer/entries-written: 0/0 #P:16
+ #
+ # _-----=> irqs-off/BH-disabled
+ # / _----=> need-resched
+ # | / _---=> hardirq/softirq
+ # || / _--=> preempt-depth
+ # ||| / _-=> migrate-disable
+ # |||| / delay
+ # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
+ # | | | ||||| | |
+
+
+Analyzing traces
+================
+
+You will be able to map the functions to system calls and other kernel
+features to get insight into the overall system activity while a
+workload/process is running.
+
+Map the NR (syscall) numbers from the trace to syscalls from the syscalls dump.
+Categorize system calls and map them to Linux subsystems.
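+
+As a sketch of the first step, assuming the raw_syscalls events were
+enabled during the run (their trace lines carry an "NR <number>" field),
+you can count how often each system call number appears and then look a
+number up with ausyscall. The number 257 below is only an example, taken
+from the ausyscall output shown earlier::
+
+ sudo grep -oE 'NR [0-9]+' /sys/kernel/debug/tracing/trace | sort | uniq -c | sort -rn
+ ausyscall 257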
+
+Conclusion
+==========
+
+This document is intended to be used as a guide on how to gather higher level
+information about a system and its run-time activity. The approach described
+in this document helps us get insight into the supported system calls and
+features, how secure a system is, and its run-time activity.
+
+References
+==========
+
+ * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/checksyscalls.sh
+ * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/get_feat.pl
+ * https://github.com/a13xp0p0v/kconfig-hardened-check
+ * https://docs.kernel.org/trace/index.html
--
2.34.1
Shuah Khan <skhan@linuxfoundation.org> writes:

> Add a new system state document to the admin-guide. This document is
> intended to be used as a guide on how to gather higher level information
> about a system and its run-time activity.
>
> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>

[...]

> +:Author: Shuah Khan <skhan@linuxfoundation.org>
> +:maintained-by: Shuah Khan <skhan@linuxfoundation.org>

Rather than adding lines like this, I think everybody would be better
served with a MAINTAINERS file entry. get_maintainer.pl doesn't know
about these lines.

> +Key Points
> +==========
> +
> + * System state includes system calls, features, static and dynamic
> + modules enabled in the kernel configuration.
> + * Supported system calls and Kernel features are architecture dependent.
> + * auditd, checksyscalls.sh, and get_feat.pl tools can be used to discover
> + static system state.
> + * Understanding Linux kernel hardening configurations options and making
> + sure they are enabled will make a system more secure.
> + * Employing run-time tracing can shed light on the dynamic system state.
> + * Workloads could change the system state by loading and unloading dynamic
> + modules and tuning system parameters.

So what I'm missing, before this even, is a paragraph saying what this
document is actually for. Who is the intended audience, and why might
they want to read this document?

[...]

> +Static System View comprises system calls, features, static and dynamic
> +modules enabled in the kernel configuration. Supported system calls

So the "static system view" includes *dynamic* modules? Fine if that's
what you intended, but it reads a bit strangely.

[...]

> +ausyscall –dump prints out the supported system calls on a system and allows

Some clever software turned your "--" into an em-dash here.

[...]

> +The first step is gathering the default system state such as the dynamic
> +and static modules loaded on the system. lsmod command prints out the

*The* lsmod command

[...]

> +The next step is discovering system activity during run-time. You can do so
> +by enabling event tracing and then running your favorite application. After
> +a period of time, gather the event logs, and kernel messages.

Might your intended readers need a hint on enabling tracing? A cross
reference to the appropriate docs if nothing else. [Later I see you get
to this; adding an "as described below" would help here.]

[...]

> + get_feat.pl list
> + get_feat.pl list –arch=arm64 lists

Lost the "--" again here

> +This scripts parses Documentation/features to find the support status

script (singular)

[...]

> +Here is how you can find if stackprotector and hread-info-in-task features

and *thread*-info-in-task

[...]

> +lsmod command shows the kernel modules that are currently loaded. This
> +program displays the contents of /proc/modules. Let's pick uvcvideo

*The* lsmod
*the* uvcvideo

[...]

> +You can get the same information from /proc.modules::
> +
> + less /proc/modules | grep uvc

why not just "grep uvc /proc/modules" ?

[...]

> +If they don't match, examine your system closely. kernel rootkits install
> +their own ps, find, etc. utilities to mask their activity. The outputs
> +match on my system. Do they on yours?

This would assume that there is no other activity on the system, of
course. Worth saying to avoid unnecessary panic.

> +Is my system as secure as it could be?
> +======================================
> +
> +Linux kernel supports several hardening options to make system secure.

*The* Linux kernel ... to make *the* system secure

the whole document could use a pass for article use

[...]

> + bin/kconfig-hardened-check --config <config file> --cmdline /proc/cmdline

Should you say what <config file> is?

[...]

> +Here is the information shown in this file::
> +
> + cat trace
> + # tracer: nop
> + #
> + # entries-in-buffer/entries-written: 0/0 #P:16

[...]

That looks like the header, certainly not "the information" found in the
file. Including some actual output would make the following discussion
more comprehensible.

[...]

> +Map the NR (syscal) numbers from the trace to syscalls from the syscalls dump.

(syscall)

> +Categorize system calls and map them to Linux subsystems.

Not sure what that sentence is trying to tell readers. Again, who is the
audience; will a readership that needs to be told how to install auditd
be able to make sense of this and act on it?

[...]

Thanks,

jon
On 3/23/23 11:55, Jonathan Corbet wrote:
> Shuah Khan <skhan@linuxfoundation.org> writes:
>
>> Add a new system state document to the admin-guide. This document is
>> intended to be used as a guide on how to gather higher level information
>> about a system and its run-time activity.
>>
>> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
>> ---
>> Changes since v1:
>> -- Addressed review comments
>>

Thank you for the review and agree with all your comments. I will
send v3 shortly.

thanks,
-- Shuah
Looks good Shuah. On Wed, Mar 22, 2023 at 10:20 AM Shuah Khan <skhan@linuxfoundation.org> wrote: > > Add a new system state document to the admin-guide. This document is > intended to be used as a guide on how to gather higher level information > about a system and its run-time activity. > > Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> > --- > Changes since v1: > -- Addressed review comments > > Documentation/admin-guide/index.rst | 1 + > Documentation/admin-guide/system-state.rst | 350 +++++++++++++++++++++ > 2 files changed, 351 insertions(+) > create mode 100644 Documentation/admin-guide/system-state.rst > > diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst > index f475554382e2..541372672c55 100644 > --- a/Documentation/admin-guide/index.rst > +++ b/Documentation/admin-guide/index.rst > @@ -66,6 +66,7 @@ subsystems expectations will be found here. > :maxdepth: 1 > > workload-tracing > + system-state > > The rest of this manual consists of various unordered guides on how to > configure specific aspects of kernel behavior to your liking. > diff --git a/Documentation/admin-guide/system-state.rst b/Documentation/admin-guide/system-state.rst > new file mode 100644 > index 000000000000..2a6fdf85c35c > --- /dev/null > +++ b/Documentation/admin-guide/system-state.rst > @@ -0,0 +1,350 @@ > +.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0) > + > +=========================================================== > +Discovering system calls and features supported on a system > +=========================================================== > + > +:Author: Shuah Khan <skhan@linuxfoundation.org> > +:maintained-by: Shuah Khan <skhan@linuxfoundation.org> > + > +Key Points > +========== > + > + * System state includes system calls, features, static and dynamic > + modules enabled in the kernel configuration. > + * Supported system calls and Kernel features are architecture dependent. 
> + * auditd, checksyscalls.sh, and get_feat.pl tools can be used to discover
> +   static system state.
> + * Understanding Linux kernel hardening configuration options and making
> +   sure they are enabled will make a system more secure.
> + * Employing run-time tracing can shed light on the dynamic system state.
> + * Workloads could change the system state by loading and unloading dynamic
> +   modules and tuning system parameters.
> +
> +System State Visualization
> +==========================
> +
> +The kernel system state can be viewed as a combination of static and
> +dynamic features and modules. Let’s first define what static and dynamic
> +system states are and then explore how we can visualize the static and
> +dynamic system parts of the kernel.
> +
> +Static System View comprises system calls, features, static and dynamic
> +modules enabled in the kernel configuration. Supported system calls
> +and Kernel features are architecture dependent. System call numbering is
> +different on different architectures. We can get the supported system call
> +information using auditd utilities.
> +
> +ausyscall --dump prints out the supported system calls on a system and allows
> +mapping syscall names and numbers. You can install the auditd package on
> +Debian-based systems::
> +
> +  sudo apt-get install auditd
> +
> +scripts/checksyscalls.sh can be used to check if the current architecture is
> +missing any system calls compared to i386.
> +
> +scripts/get_feat.pl can be used to list the Kernel feature support matrix
> +for an architecture.
> +
> +Dynamic System View comprises system calls, ioctls invoked, and subsystems
> +used during the runtime. A workload could load and unload modules and also
> +change the dynamic system configuration to suit its needs by tuning system
> +parameters.
> +
> +What is the methodology?
> +========================
> +
> +The first step is gathering the default system state such as the dynamic
> +and static modules loaded on the system.
> +The lsmod command prints out the
> +dynamically loaded modules on a system. Statically configured modules can
> +be found in the kernel configuration file.
> +
> +The next step is discovering system activity during run-time. You can do so
> +by enabling event tracing and then running your favorite application. After
> +a period of time, gather the event logs and kernel messages.
> +
> +Once you have the necessary information, you can extract the system call
> +numbers from the event trace log and map them to the supported system calls.
> +
> +Finding supported system calls
> +==============================
> +
> +As mentioned earlier, ausyscall prints out supported system calls
> +on a system and allows mapping syscall names and numbers::
> +
> +  ausyscall --dump
> +
> +You can look for specific system calls as shown below::
> +
> +  ausyscall open
> +  open               2
> +  mq_open            240
> +  openat             257
> +  perf_event_open    298
> +  open_by_handle_at  304
> +  open_tree          428
> +  fsopen             430
> +  pidfd_open         434
> +  openat2            437
> +
> +  ausyscall time
> +
> +  getitimer          36
> +  setitimer          38
> +  gettimeofday       96
> +  times              100
> +  rt_sigtimedwait    128
> +  utime              132
> +  adjtimex           159
> +  settimeofday       164
> +  time               201
> +  semtimedop         220
> +  timer_create       222
> +  timer_settime      223
> +  timer_gettime      224
> +  timer_getoverrun   225
> +  timer_delete       226
> +  clock_settime      227
> +  clock_gettime      228
> +  utimes             235
> +  mq_timedsend       242
> +  mq_timedreceive    243
> +  futimesat          261
> +  utimensat          280
> +  timerfd_create     283
> +  timerfd_settime    286
> +  timerfd_gettime    287
> +  clock_adjtime      305
> +
> +Finding unsupported system calls
> +================================
> +
> +As mentioned earlier, scripts/checksyscalls.sh checks for system calls
> +missing on the current architecture compared to i386.
> +Example run::
> +
> +  checksyscalls.sh gcc
> +  warning: #warning syscall mmap2 not implemented [-Wcpp]
> +  warning: #warning syscall truncate64 not implemented [-Wcpp]
> +  warning: #warning syscall ftruncate64 not implemented [-Wcpp]
> +  warning: #warning syscall fcntl64 not implemented [-Wcpp]
> +  warning: #warning syscall sendfile64 not implemented [-Wcpp]
> +  warning: #warning syscall statfs64 not implemented [-Wcpp]
> +  warning: #warning syscall fstatfs64 not implemented [-Wcpp]
> +  warning: #warning syscall fadvise64_64 not implemented [-Wcpp]
> +
> +Let's check this against ausyscall now::
> +
> +  ausyscall map
> +  mmap               9
> +  munmap             11
> +  mremap             25
> +  remap_file_pages   216
> +
> +  ausyscall trunc
> +  truncate           76
> +  ftruncate          77
> +
> +As you can see, ausyscall shows mmap2, truncate64, and ftruncate64 aren't
> +implemented on this system. This matches what checksyscalls.sh shows.
> +
> +Finding supported features
> +==========================
> +
> +scripts/get_feat.pl can be used to list the Kernel feature support matrix
> +for an architecture::
> +
> +  get_feat.pl list
> +  get_feat.pl list --arch=arm64
> +
> +This script parses Documentation/features to find the support status
> +information. It can be used to validate the contents of the files under
> +Documentation/features or simply list them::
> +
> +  --arch               Outputs features for a specific architecture, optionally filtering
> +                       for a single specific feature.
> +  --feat or --feature  Output features for a single specific feature.
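The manual cross-check between the checksyscalls.sh warnings and the ausyscall output shown earlier in this hunk could be scripted. What follows is only a minimal sketch under stated assumptions: the function name missing_syscalls and both input files are hypothetical, the warnings are assumed to keep the "#warning syscall NAME not implemented" wording from the example run, and the table is assumed to hold one syscall name and one number per line (in either column order, since any non-numeric word is treated as a name).

```shell
# missing_syscalls WARNINGS TABLE   (hypothetical helper, not part of the patch)
#   WARNINGS: saved compiler warnings from a checksyscalls.sh run
#   TABLE:    saved ausyscall output, one syscall name and number per line
# Prints "NAME missing" or "NAME present" for every syscall named in WARNINGS.
missing_syscalls() {
    awk '
        # First file (TABLE): record every non-numeric word as a known name.
        NR == FNR {
            for (i = 1; i <= NF; i++)
                if ($i !~ /^[0-9]+$/) present[$i] = 1
            next
        }
        # Second file (WARNINGS): find the word between "syscall" and "not".
        {
            for (i = 1; i < NF; i++)
                if ($i == "syscall" && $(i + 2) == "not") {
                    name = $(i + 1)
                    print name, ((name in present) ? "present" : "missing")
                }
        }
    ' "$2" "$1"
}
```

With the outputs quoted above saved to files, mmap2 and truncate64 would be reported as missing, which is the same conclusion the document reaches by eye. An empty TABLE file is not handled by this sketch.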
> +
> +Here is how you can find if stackprotector and thread-info-in-task features
> +are supported::
> +
> +  scripts/get_feat.pl --arch=arm64 --feat=stackprotector list
> +  #
> +  # Kernel feature support matrix of the 'arm64' architecture:
> +  #
> +  debug/ stackprotector : ok | HAVE_STACKPROTECTOR #
> +  arch supports compiler driven stack overflow protection
> +
> +  scripts/get_feat.pl --feat=thread-info-in-task list
> +  #
> +  # Kernel feature support matrix of the 'x86' architecture:
> +  #
> +  core/ thread-info-in-task : ok | THREAD_INFO_IN_TASK #
> +  arch makes use of the core kernel facility to embed thread_info in
> +  task_struct
> +
> +Finding kernel module status
> +============================
> +
> +The lsmod command shows the kernel modules that are currently loaded. This
> +program displays the contents of /proc/modules. Let's pick the uvcvideo
> +module which is found on most laptops::
> +
> +  lsmod | grep uvc
> +  uvcvideo              126976  0
> +  videobuf2_vmalloc      20480  1 uvcvideo
> +  uvc                    16384  1 uvcvideo
> +  videobuf2_v4l2         36864  1 uvcvideo
> +  videodev              315392  2 videobuf2_v4l2,uvcvideo
> +  videobuf2_common       65536  4 videobuf2_vmalloc,videobuf2_v4l2,uvcvideo,videobuf2_memops
> +  mc                     77824  4 videodev,videobuf2_v4l2,uvcvideo,videobuf2_common
> +
> +You can see that lsmod shows uvcvideo and the modules it depends on and how
> +many modules are using them. videobuf2_common is in use by 4 other modules.
> +In other words, this is the reference count for this module and rmmod will
> +refuse to unload it as long as the reference count is > 0.
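The reference-count rule described above can be checked mechanically. A minimal sketch, assuming the three leading columns (name, size, reference count) shown in the lsmod example; the function name busy_modules is hypothetical:

```shell
# busy_modules   (hypothetical helper, not part of the patch)
# Reads lsmod-style rows on stdin and prints the name and reference count of
# every module whose count is greater than zero, i.e. the modules that rmmod
# would refuse to unload. The header row is skipped because its third column
# is not a number.
busy_modules() {
    awk '$3 ~ /^[0-9]+$/ && $3 > 0 { print $1, $3 }'
}
```

Usage would be `lsmod | busy_modules`. In the example output above this would list videobuf2_common with a count of 4 and leave out uvcvideo, whose count is 0.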
> +
> +You can get the same information from /proc/modules::
> +
> +  grep uvc /proc/modules
> +  uvcvideo 126976 0 - Live 0x0000000000000000
> +  videobuf2_vmalloc 20480 1 uvcvideo, Live 0x0000000000000000
> +  uvc 16384 1 uvcvideo, Live 0x0000000000000000
> +  videobuf2_v4l2 36864 1 uvcvideo, Live 0x0000000000000000
> +  videodev 315392 2 uvcvideo,videobuf2_v4l2, Live 0x0000000000000000
> +  videobuf2_common 65536 4 uvcvideo,videobuf2_vmalloc,videobuf2_memops,videobuf2_v4l2, Live 0x0000000000000000
> +  mc 77824 4 uvcvideo,videobuf2_v4l2,videodev,videobuf2_common, Live 0x0000000000000000
> +
> +The information is similar, with a few extra fields. The address is the
> +base address for the module in kernel virtual memory space. When run as a
> +normal user, the address is all zeros. The same command when run as root
> +shows the following::
> +
> +  sudo grep uvc /proc/modules
> +  uvcvideo 126976 0 - Live 0xffffffffc1c8b000
> +  videobuf2_vmalloc 20480 1 uvcvideo, Live 0xffffffffc167f000
> +  uvc 16384 1 uvcvideo, Live 0xffffffffc0ab0000
> +  videobuf2_v4l2 36864 1 uvcvideo, Live 0xffffffffc0a28000
> +  videodev 315392 2 uvcvideo,videobuf2_v4l2, Live 0xffffffffc16e9000
> +  videobuf2_common 65536 4 uvcvideo,videobuf2_vmalloc,videobuf2_memops,videobuf2_v4l2, Live 0xffffffffc094d000
> +  mc 77824 4 uvcvideo,videobuf2_v4l2,videodev,videobuf2_common, Live 0xffffffffc15eb000
> +
> +Let's check what modinfo shows that is important for us::
> +
> +  /sbin/modinfo uvcvideo
> +  filename:       /lib/modules/6.3.0-rc2/kernel/drivers/media/usb/uvc/uvcvideo.ko
> +  license:        GPL
> +  description:    USB Video Class driver
> +  depends:        videobuf2-v4l2,videodev,mc,uvc,videobuf2-common,videobuf2-vmalloc
> +  retpoline:      Y
> +  intree:         Y
> +  name:           uvcvideo
> +  vermagic:       6.3.0-rc2 SMP preempt mod_unload modversions
> +  sig_id:         PKCS#7
> +  signer:         Build time autogenerated kernel key
> +
> +This tells us that this module is built intree and is signed with a build
> +time autogenerated key.
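The two facts read off the modinfo output above (intree and signer) can be pulled out without scanning the whole listing. A minimal sketch, assuming the "field: value" layout shown in the example; the function name modinfo_field is hypothetical:

```shell
# modinfo_field FIELD   (hypothetical helper, not part of the patch)
# Reads modinfo output on stdin and prints the value of the line whose first
# word is "FIELD:". Runs of spaces inside the value are collapsed to one.
modinfo_field() {
    awk -v key="$1:" '$1 == key { $1 = ""; sub(/^ +/, ""); print }'
}
```

Usage would be `/sbin/modinfo uvcvideo | modinfo_field signer`, which for the listing above would print the build time autogenerated key line. A field that occurs more than once is printed once per occurrence.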
> +
> +Let's do one last sanity check on the system to see if the following two
> +command outputs match::
> +
> +  ps ax | wc -l
> +  ls -d /proc/* | grep '[0-9]' | wc -l
> +
> +If they don't match, examine your system closely. Kernel rootkits install
> +their own ps, find, etc. utilities to mask their activity. The outputs
> +match on my system. Do they on yours?
> +
> +Is my system as secure as it could be?
> +======================================
> +
> +The Linux kernel supports several hardening options to make a system more
> +secure. The kconfig-hardened-check tool sanity checks the kernel configuration
> +for security. You can clone the latest kconfig-hardened-check repository::
> +
> +  git clone https://github.com/a13xp0p0v/kconfig-hardened-check.git
> +  cd kconfig-hardened-check
> +  bin/kconfig-hardened-check --config <config file> --cmdline /proc/cmdline
> +
> +This will generate a detailed report of kernel security configuration and
> +command line options that are enabled (OK) and the ones that aren't (FAIL)
> +and a summary line at the end::
> +
> +  [+] Config check is finished: 'OK' - 100 / 'FAIL' - 100
> +
> +You will have to analyze the information to determine which options make
> +sense to enable on your system.
> +
> +Understanding system run-time activity
> +======================================
> +
> +Enabling event tracing gives insight into system run-time activity. This is
> +a good way to identify which parts of the kernel are used at a higher level
> +while system is in and/or while a specific workload/process is running.
> +
> +Event tracing depends on the CONFIG_EVENT_TRACING option being enabled. You
> +can enable event tracing before starting a workload/process. Event tracing allows
> +you to dynamically enable and disable tracing on supported/available events.
> +You can find available events, tracers, and filter functions in the following
> +files::
> +
> +  /sys/kernel/debug/tracing/available_events
> +  /sys/kernel/debug/tracing/available_filter_functions
> +  /sys/kernel/debug/tracing/available_tracers
> +
> +Now this is how you can enable tracing::
> +
> +  echo 1 | sudo tee /sys/kernel/debug/tracing/events/enable
> +
> +Once the workload/process stops or when you decide you have the status you
> +need, you can disable event tracing::
> +
> +  echo 0 | sudo tee /sys/kernel/debug/tracing/events/enable
> +
> +You can find the tracing information in the file::
> +
> +  /sys/kernel/debug/tracing/trace
> +
> +Here is the information shown in this file::
> +
> +  sudo cat /sys/kernel/debug/tracing/trace
> +  # tracer: nop
> +  #
> +  # entries-in-buffer/entries-written: 0/0   #P:16
> +  #
> +  #                                _-----=> irqs-off/BH-disabled
> +  #                               / _----=> need-resched
> +  #                              | / _---=> hardirq/softirq
> +  #                              || / _--=> preempt-depth
> +  #                              ||| / _-=> migrate-disable
> +  #                              |||| /     delay
> +  #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> +  #              | |         |   |||||     |         |
> +
> +
> +Analyzing traces
> +================
> +
> +You will be able to map the functions to system calls and other kernel features
> +to get insight into the overall system activity while a workload/process is
> +running.
> +
> +Map the NR (syscall) numbers from the trace to syscalls from the syscalls dump.
> +Categorize system calls and map them to Linux subsystems.
> +
> +Conclusion
> +==========
> +
> +This document is intended to be used as a guide on how to gather higher level
> +information about a system and its run-time activity. The approach described
> +in this document helps us get insight into supported system calls, features,
> +assess how secure a system is, and its run-time activity.
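The mapping step in the "Analyzing traces" section (NR numbers from the trace to names from the syscall table) could be automated. This is a minimal sketch under stated assumptions, not something the patch provides: the function name syscall_histogram and both input files are hypothetical, trace lines are assumed to contain "sys_enter: NR <number>" (the layout expected from the raw_syscalls sys_enter event), and the table is assumed to hold one name and one number per line in either order.

```shell
# syscall_histogram TRACE TABLE   (hypothetical helper, not part of the patch)
#   TRACE: saved copy of the trace file
#   TABLE: saved ausyscall output, one syscall name and number per line
# Prints "COUNT NUMBER NAME" per syscall seen in TRACE, most frequent first.
syscall_histogram() {
    awk '
        # First file (TABLE): build the number-to-name map.
        NR == FNR {
            if ($1 ~ /^[0-9]+$/) names[$1] = $2
            else if ($2 ~ /^[0-9]+$/) names[$2] = $1
            next
        }
        # Second file (TRACE): count only syscall entry events, so that the
        # matching exit events do not double the totals.
        /sys_enter:/ {
            for (i = 1; i < NF; i++)
                if ($i == "NR") { count[$(i + 1)]++; break }
        }
        END {
            for (nr in count)
                print count[nr], nr, ((nr in names) ? names[nr] : "unknown")
        }
    ' "$2" "$1" | sort -k1,1nr -k2,2n
}
```

The resulting list is a starting point for the categorization the document asks for: each named syscall can then be assigned to a Linux subsystem by hand.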
> +
> +References
> +==========
> +
> + * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/checksyscalls.sh
> + * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/scripts/get_feat.pl
> + * https://github.com/a13xp0p0v/kconfig-hardened-check
> + * https://docs.kernel.org/trace/index.html
> --
> 2.34.1
>