This patch proposes initial kernel-doc documentation for memory_open()
and most of the functions in the mem_fops structure.
The format used for the specifications follows the guidelines
defined in Documentation/doc-guide/code-specifications.rst
Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com>
---
drivers/char/mem.c | 231 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 225 insertions(+), 6 deletions(-)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 48839958b0b1..e69c164e9465 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -75,9 +75,54 @@ static inline bool should_stop_iteration(void)
return signal_pending(current);
}
-/*
- * This funcion reads the *physical* memory. The f_pos points directly to the
- * memory location.
+/**
+ * read_mem - read from physical memory (/dev/mem).
+ * @file: struct file associated with /dev/mem.
+ * @buf: user-space buffer to copy data to.
+ * @count: number of bytes to read.
+ * @ppos: pointer to the current file position, representing the physical
+ * address to read from.
+ *
+ * This function checks if the requested physical memory range is valid
+ * and accessible by the user, then it copies data to the input
+ * user-space buffer up to the requested number of bytes.
+ *
+ * Function's expectations:
+ *
+ * 1. This function shall check if the value pointed by ppos exceeds the
+ * maximum addressable physical address;
+ *
+ * 2. This function shall check if the physical address range to be read
+ * is valid (i.e. it falls within a memory block and if it can be mapped
+ * to the kernel address space);
+ *
+ * 3. For each memory page falling in the requested physical range
+ * [ppos, ppos + count - 1]:
+ * 3.1. this function shall check if user space access is allowed (if
+ * config STRICT_DEVMEM is not set, access is always granted);
+ *
+ * 3.2. if access is allowed, the memory content from the page range falling
+ * within the requested physical range shall be copied to the user space
+ * buffer;
+ *
+ * 3.3. zeros shall be copied to the user space buffer (for the page range
+ * falling within the requested physical range):
+ * 3.3.1. if access to the memory page is restricted or,
+ * 3.2.2. if the current page is page 0 on HW architectures where page 0 is
+ * not mapped.
+ *
+ * 4. The file position '*ppos' shall be advanced by the number of bytes
+ * successfully copied to user space (including zeros).
+ *
+ * Context: process context.
+ *
+ * Return:
+ * * the number of bytes copied to user on success
+ * * %-EFAULT - the requested address range is not valid or a fault happened
+ * when copying to user-space (i.e. copy_from_kernel_nofault() failed)
+ * * %-EPERM - access to any of the required physical pages is not allowed
+ * * %-ENOMEM - out of memory error for auxiliary kernel buffers supporting
+ * the operation of copying content from the physical pages
*/
static ssize_t read_mem(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
@@ -166,6 +211,54 @@ static ssize_t read_mem(struct file *file, char __user *buf,
return err;
}
+/**
+ * write_mem - write to physical memory (/dev/mem).
+ * @file: struct file associated with /dev/mem.
+ * @buf: user-space buffer containing the data to write.
+ * @count: number of bytes to write.
+ * @ppos: pointer to the current file position, representing the physical
+ * address to write to.
+ *
+ * This function checks if the target physical memory range is valid
+ * and accessible by the user, then it writes data from the input
+ * user-space buffer up to the requested number of bytes.
+ *
+ * Function's expectations:
+ * 1. This function shall check if the value pointed by ppos exceeds the
+ * maximum addressable physical address;
+ *
+ * 2. This function shall check if the physical address range to be written
+ * is valid (i.e. it falls within a memory block and if it can be mapped
+ * to the kernel address space);
+ *
+ * 3. For each memory page falling in the physical range to be written
+ * [ppos, ppos + count - 1]:
+ * 3.1. this function shall check if user space access is allowed (if
+ * config STRICT_DEVMEM is not set, access is always granted);
+ *
+ * 3.2. the content from the user space buffer shall be copied to the page
+ * range falling within the physical range to be written if access is
+ * allowed;
+ *
+ * 3.3. the data to be copied from the user space buffer (for the page range
+ * falling within the range to be written) shall be skipped:
+ * 3.3.1. if access to the memory page is restricted or,
+ * 3.3.2. if the current page is page 0 on HW architectures where page 0
+ * is not mapped.
+ *
+ * 4. The file position '*ppos' shall be advanced by the number of bytes
+ * successfully copied from user space (including skipped bytes).
+ *
+ * Context: process context.
+ *
+ * Return:
+ * * the number of bytes copied from user-space on success
+ * * %-EFBIG - the value pointed by ppos exceeds the maximum addressable
+ * physical address
+ * * %-EFAULT - the physical address range is not valid or no bytes could
+ * be copied from user-space
+ * * %-EPERM - access to any of the required pages is not allowed
+ */
static ssize_t write_mem(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
@@ -322,6 +415,42 @@ static const struct vm_operations_struct mmap_mem_ops = {
#endif
};
+/**
+ * mmap_mem - map physical memory into user space (/dev/mem).
+ * @file: file structure for the device.
+ * @vma: virtual memory area structure describing the user mapping.
+ *
+ * This function checks if the requested physical memory range is valid
+ * and accessible by the user, then it maps the physical memory range to
+ * user-mode address space.
+ *
+ * Function's expectations:
+ * 1. This function shall check if the requested physical address range to be
+ * mapped fits within the maximum addressable physical range;
+ *
+ * 2. This function shall check if the requested physical range corresponds to
+ * a valid physical range and if access is allowed on it (if config STRICT_DEVMEM
+ * is not set, access is always allowed);
+ *
+ * 3. This function shall check if the input virtual memory area can be used for
+ * a private mapping (always OK if there is an MMU);
+ *
+ * 4. This function shall set the virtual memory area operations to
+ * &mmap_mem_ops;
+ *
+ * 5. This function shall establish a mapping between the user-space
+ * virtual memory area described by vma and the physical memory
+ * range specified by vma->vm_pgoff and size;
+ *
+ * Context: process context.
+ *
+ * Return:
+ * * 0 on success
+ * * %-EAGAIN - invalid or unsupported mapping requested (remap_pfn_range()
+ * fails)
+ * * %-EINVAL - requested physical range to be mapped is not valid
+ * * %-EPERM - no permission to access the requested physical range
+ */
static int mmap_mem(struct file *file, struct vm_area_struct *vma)
{
size_t size = vma->vm_end - vma->vm_start;
@@ -550,13 +679,47 @@ static loff_t null_lseek(struct file *file, loff_t offset, int orig)
return file->f_pos = 0;
}
-/*
+/**
+ * memory_lseek - change the file position.
+ * @file: file structure for the device.
+ * @offset: file offset to seek to.
+ * @orig: where to start seeking from (see whence in the llseek manpage).
+ *
+ * This function changes the file position according to the input offset
+ * and orig parameters.
+ *
+ * Function's expectations:
+ * 1. This function shall lock the semaphore of the inode corresponding to the
+ * input file before any operation and unlock it before returning.
+ *
+ * 2. This function shall check the orig value and accordingly:
+ * 2.1. if it is equal to SEEK_CUR, the current file position shall be
+ * incremented by the input offset;
+ * 2.2. if it is equal to SEEK_SET, the current file position shall be
+ * set to the input offset value;
+ * 2.3. any other value shall result in an error condition.
+ *
+ * 3. Before writing the current file position, the new position value
+ * shall be checked to not overlap with Linux ERRNO values.
+ *
+ * Assumptions of Use:
+ * 1. the input file pointer is expected to be valid.
+ *
+ * Notes:
* The memory devices use the full 32/64 bits of the offset, and so we cannot
* check against negative addresses: they are ok. The return value is weird,
* though, in that case (0).
*
- * also note that seeking relative to the "end of file" isn't supported:
- * it has no meaning, so it returns -EINVAL.
+ * Also note that seeking relative to the "end of file" isn't supported:
+ * it has no meaning, so passing orig equal to SEEK_END returns -EINVAL.
+ *
+ * Context: process context, locks/unlocks inode->i_rwsem
+ *
+ * Return:
+ * * the new file position on success
+ * * %-EOVERFLOW - the new position value equals or exceeds
+ * (unsigned long long) -MAX_ERRNO
+ * * %-EINVAL - the orig parameter is invalid
*/
static loff_t memory_lseek(struct file *file, loff_t offset, int orig)
{
@@ -584,6 +747,35 @@ static loff_t memory_lseek(struct file *file, loff_t offset, int orig)
return ret;
}
+/**
+ * open_port - open the I/O port device (/dev/port).
+ * @inode: inode of the device file.
+ * @filp: file structure for the device.
+ *
+ * This function checks if the caller can access the port device and sets
+ * the f_mapping pointer of filp to the i_mapping pointer of inode.
+ *
+ * Function's expectations:
+ * 1. This function shall check if the caller has sufficient capabilities to
+ * perform raw I/O access;
+ *
+ * 2. This function shall check if the kernel is locked down with the
+ * &LOCKDOWN_DEV_MEM restriction;
+ *
+ * 3. If the input inode corresponds to /dev/mem, the f_mapping pointer
+ * of the input file structure shall be set to the i_mapping pointer
+ * of the input inode;
+ *
+ * Assumptions of Use:
+ * 1. The input inode and filp are expected to be valid.
+ *
+ * Context: process context.
+ *
+ * Return:
+ * * 0 on success
+ * * %-EPERM - caller lacks the required capability (CAP_SYS_RAWIO)
+ * * any error returned by securty_locked_down()
+ */
static int open_port(struct inode *inode, struct file *filp)
{
int rc;
@@ -691,6 +883,33 @@ static const struct memdev {
#endif
};
+/**
+ * memory_open - set the filp f_op to the memory device fops and invoke open().
+ * @inode: inode of the device file.
+ * @filp: file structure for the device.
+ *
+ * Function's expectations:
+ * 1. This function shall retrieve the minor number associated with the input
+ * inode and the memory device corresponding to such minor number;
+ *
+ * 2. The file operations pointer shall be set to the memory device file operations;
+ *
+ * 3. The file mode member of the input filp shall be OR'd with the device mode;
+ *
+ * 4. The memory device open() file operation shall be invoked.
+ *
+ * Assumptions of Use:
+ * 1. The input inode and filp are expected to be non-NULL.
+ *
+ * Context: process context.
+ *
+ * Return:
+ * * 0 on success
+ * * %-ENXIO - the minor number corresponding to the input inode cannot be
+ * associated with any device or the corresponding device has a NULL fops
+ * pointer
+ * * any error returned by the device specific open function pointer
+ */
static int memory_open(struct inode *inode, struct file *filp)
{
int minor;
--
2.48.1
Gabriele Paoloni <gpaoloni@redhat.com> writes: > This patch proposes initial kernel-doc documentation for memory_open() > and most of the functions in the mem_fops structure. > The format used for the specifications follows the guidelines > defined in Documentation/doc-guide/code-specifications.rst I'll repeat my obnoxious question from the first patch: what does that buy for us? > Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com> > --- > drivers/char/mem.c | 231 +++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 225 insertions(+), 6 deletions(-) > > diff --git a/drivers/char/mem.c b/drivers/char/mem.c > index 48839958b0b1..e69c164e9465 100644 > --- a/drivers/char/mem.c > +++ b/drivers/char/mem.c > @@ -75,9 +75,54 @@ static inline bool should_stop_iteration(void) > return signal_pending(current); > } > > -/* > - * This funcion reads the *physical* memory. The f_pos points directly to the > - * memory location. > +/** > + * read_mem - read from physical memory (/dev/mem). > + * @file: struct file associated with /dev/mem. > + * @buf: user-space buffer to copy data to. > + * @count: number of bytes to read. > + * @ppos: pointer to the current file position, representing the physical > + * address to read from. > + * > + * This function checks if the requested physical memory range is valid > + * and accessible by the user, then it copies data to the input > + * user-space buffer up to the requested number of bytes. > + * > + * Function's expectations: > + * > + * 1. This function shall check if the value pointed by ppos exceeds the > + * maximum addressable physical address; > + * > + * 2. This function shall check if the physical address range to be read > + * is valid (i.e. it falls within a memory block and if it can be mapped > + * to the kernel address space); > + * > + * 3. For each memory page falling in the requested physical range > + * [ppos, ppos + count - 1]: > + * 3.1. this function shall check if user space access is allowed (if > + * config STRICT_DEVMEM is not set, access is always granted); > + * > + * 3.2. if access is allowed, the memory content from the page range falling > + * within the requested physical range shall be copied to the user space > + * buffer; > + * > + * 3.3. zeros shall be copied to the user space buffer (for the page range > + * falling within the requested physical range): > + * 3.3.1. if access to the memory page is restricted or, > + * 3.2.2. if the current page is page 0 on HW architectures where page 0 is > + * not mapped. > + * > + * 4. The file position '*ppos' shall be advanced by the number of bytes > + * successfully copied to user space (including zeros). My kneejerk first reaction is: you are repeating the code of the function in a different language. If we are not convinced that the code is correct, how can we be more confident that this set of specifications is correct? And again, what will consume this text? How does going through this effort get us to a better kernel? Despite having been to a couple of your talks, I'm not fully understanding how this comes together; people who haven't been to the talks are not going to have an easier time getting the full picture. Thanks, jon
Hi Jonathan On Tue, Sep 16, 2025 at 12:39 AM Jonathan Corbet <corbet@lwn.net> wrote: > > Gabriele Paoloni <gpaoloni@redhat.com> writes: > > > This patch proposes initial kernel-doc documentation for memory_open() > > and most of the functions in the mem_fops structure. > > The format used for the specifications follows the guidelines > > defined in Documentation/doc-guide/code-specifications.rst > > I'll repeat my obnoxious question from the first patch: what does that > buy for us? I tried to explain my reply on patch 1 > > > Signed-off-by: Gabriele Paoloni <gpaoloni@redhat.com> > > --- > > drivers/char/mem.c | 231 +++++++++++++++++++++++++++++++++++++++++++-- > > 1 file changed, 225 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/char/mem.c b/drivers/char/mem.c > > index 48839958b0b1..e69c164e9465 100644 > > --- a/drivers/char/mem.c > > +++ b/drivers/char/mem.c > > @@ -75,9 +75,54 @@ static inline bool should_stop_iteration(void) > > return signal_pending(current); > > } > > > > -/* > > - * This funcion reads the *physical* memory. The f_pos points directly to the > > - * memory location. > > +/** > > + * read_mem - read from physical memory (/dev/mem). > > + * @file: struct file associated with /dev/mem. > > + * @buf: user-space buffer to copy data to. > > + * @count: number of bytes to read. > > + * @ppos: pointer to the current file position, representing the physical > > + * address to read from. > > + * > > + * This function checks if the requested physical memory range is valid > > + * and accessible by the user, then it copies data to the input > > + * user-space buffer up to the requested number of bytes. > > + * > > + * Function's expectations: > > + * > > + * 1. This function shall check if the value pointed by ppos exceeds the > > + * maximum addressable physical address; > > + * > > + * 2. This function shall check if the physical address range to be read > > + * is valid (i.e. it falls within a memory block and if it can be mapped > > + * to the kernel address space); > > + * > > + * 3. For each memory page falling in the requested physical range > > + * [ppos, ppos + count - 1]: > > + * 3.1. this function shall check if user space access is allowed (if > > + * config STRICT_DEVMEM is not set, access is always granted); > > + * > > + * 3.2. if access is allowed, the memory content from the page range falling > > + * within the requested physical range shall be copied to the user space > > + * buffer; > > + * > > + * 3.3. zeros shall be copied to the user space buffer (for the page range > > + * falling within the requested physical range): > > + * 3.3.1. if access to the memory page is restricted or, > > + * 3.2.2. if the current page is page 0 on HW architectures where page 0 is > > + * not mapped. > > + * > > + * 4. The file position '*ppos' shall be advanced by the number of bytes > > + * successfully copied to user space (including zeros). > > My kneejerk first reaction is: you are repeating the code of the > function in a different language. If we are not convinced that the code > is correct, how can we be more confident that this set of specifications > is correct? And again, what will consume this text? How does going > through this effort get us to a better kernel? In summary specifications provide the criteria to be used in verifying the code (both when reviewing and testing). Otherwise: 1) Developers and reviewers have no criteria to evaluate the code, other than their expertise and judgement when they read it; 2) Testers would write test cases based on the code itself (so it is more likely that a wrong code is not detected due to wrong test cases). WRT your first point, if specifications are wrong, a reviewer or a test would detect a gap between code and associated specs, hence leading to process of scrutiny of both code and specs where such a gap must be resolved. This is the reason why the duality of specification and tests VS the code being verified lead to confidence in code becoming more dependable (also from a user point of view that can now clearly see the assumptions to be met when invoking the code) Thanks Gab > > Despite having been to a couple of your talks, I'm not fully > understanding how this comes together; people who haven't been to the > talks are not going to have an easier time getting the full picture. > > Thanks, > > jon >
On Mon Sep 15, 2025 at 10:39 PM UTC, Jonathan Corbet wrote: > Gabriele Paoloni <gpaoloni@redhat.com> writes: > >> This patch proposes initial kernel-doc documentation for memory_open() and >> most of the functions in the mem_fops structure. The format used for the >> specifications follows the guidelines defined in >> Documentation/doc-guide/code-specifications.rst > > I'll repeat my obnoxious question from the first patch: what does that buy > for us? Fair question, and definitely not obnoxious. It might help to reframe this a bit. The idea is to take an engineering technique from one domain and apply it with modifications to another. The relevant terms of art are "forward engineering" and "reverse engineering". > My kneejerk first reaction is: you are repeating the code of the function in > a different language. No disagreement on that perception. We have more work to do when it comes to communicating the idea, as well as developing a better implementation. The design of the Linux kernel is emergent and, in the present state, all forms of testing are an (educated) guess at the intended design. We can demonstrate this by picking a random bit of code from the kernel and assigning ourselves the task of writing a test for it. Are you certain that your test accurately reflects the true design intent? You can read the code and test what you see. But that does not mean that your test is valid against the intent in someone else's head. Music instructors see this whenever their students play the right notes but clearly do not yet "feel" the music. The difference is noticeable even by casual listeners. > If we are not convinced that the code is correct, how can we be more > confident that this set of specifications is correct? We have no reason to be independently convinced of either. When we describe this as importing a technique into a new domain, your question is an example of some of the concessions that have to be made. The Linux kernel is not a forward engineered system. Therefore it is not possible to develop code and test from the same seed. Our only option is to reverse engineer that seed to the best of our abilities. At that point we have a few options. Ideally, the original developer can weigh in and validate that our interpretation is correct. This has the effect of "simulating" a forward engineering scenario, because a test can be created from the validated seed (I am trying valiantly to avoid using the word kernel). Absent the original developer's validation, we have the option of simply asserting the specification. This is equivalent to the way testing is done today, except a test can be equally opaque with respect to what design it is attempting to validate. In either case, if a test is developed against the specification, even an initially incorrect specification, we have the ability to bring code, specification, and test into alignment over time. > And again, what will consume this text? Humans are the consumer. But to be clear - a machine readable template is going to be required in the long run to ensure that code and specification remain aligned. Our intentent was to avoid confusing things with templates, and introduce them once we have made headway on the points you have brought up. It is probably also worth mentioning, we have already had an "a-ha" moment from one kernel maintainer. I believe the words were something to the effect of, "this is great, I used to have to relearn that code every time I touch it". > How does going through this effort get us to a better kernel? I am hoping some of the above planted the seed to answer this one. Code must be correct in two ways, it must be valid and it must be verified. Valid means - the code is doing the right thing. Verified means - the code is doing the thing right. If code and test accurately reflect the same idea, then we can alleviate maintainers of a large portion of the verification burden. Validation is in the "hearts and minds" of the users, so that burden never goes away. > Despite having been to a couple of your talks, I'm not fully understanding > how this comes together; people who haven't been to the talks are not going > to have an easier time getting the full picture. I agree. And thank you very much for attending those talks and engaging with us. It truly means a lot. I have submitted a refereed talk to this year's Pumbers conference that is intended to go over these points in detail. My colleague (not on this thread) has also submitted a refereed talk on best practices for developing these specifications. His name is Matthew Whitehead and he is a recognized expert in that area. ..Ch:W..
© 2016 - 2026 Red Hat, Inc.