.../admin-guide/kernel-parameters.txt | 6 +- Documentation/core-api/dma-attributes.rst | 10 + drivers/base/core.c | 4 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- drivers/gpu/drm/drm_prime.c | 2 +- include/linux/device.h | 12 + include/linux/dma-mapping.h | 6 + include/linux/swiotlb.h | 54 ++- kernel/dma/swiotlb.c | 382 ++++++++++++++++-- 9 files changed, 443 insertions(+), 35 deletions(-)
From: Petr Tesarik <petr.tesarik.ext@huawei.com> The goal of my work is to provide more flexibility in the sizing of SWIOTLB. The software IO TLB was designed with these assumptions: 1. It would not be used much, especially on 64-bit systems. 2. A small fixed memory area (64 MiB by default) is sufficient to handle the few cases which require a bounce buffer. 3. 64 MiB is little enough that it has no impact on the rest of the system. First, if SEV is active, all DMA must be done through shared unencrypted pages, and SWIOTLB is used to make this happen without changing device drivers. The software IO TLB size is increased to 6% of total memory in sev_setup_arch(), but that is more of an approximation. The actual requirements may vary depending on the amount of I/O and which drivers are used. These factors may not be know at boot time, i.e. when SWIOTLB is allocated. Second, other colleagues have noticed that they can reliably get rid of occasional OOM kills on an Arm embedded device by reducing the SWIOTLB size. This can be achieved with a kernel parameter, but determining the right value puts additional burden on pre-release testing, which could be avoided if SWIOTLB is allocated small and grows only when necessary. Changes from v1-devel-v7: - Add comments to acquire/release barriers - Fix whitespace issues reported by checkpatch.pl Changes from v1-devel-v6: - Provide long description of functions - Fix kernel-doc (Returns: to Return:) - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() Changes from RFC: - Track dynamic buffers per device instead of per swiotlb - Use a linked list instead of a maple tree - Move initialization of swiotlb fields of struct device to a helper function - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() - Introduce per-device flag if dynamic buffers are in use - Add one more user of DMA_ATTR_MAY_SLEEP - Add kernel-doc comments for new (and some old) code - Properly escape '*' in dma-attributes.rst Petr Tesarik (7): swiotlb: Use a helper to initialize swiotlb fields in struct device swiotlb: Move code around in preparation for dynamic bounce buffers dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute swiotlb: Dynamically allocated bounce buffers swiotlb: Add a boot option to enable dynamic bounce buffers drm: Use DMA_ATTR_MAY_SLEEP from process context swiotlb: per-device flag if there are dynamically allocated buffers .../admin-guide/kernel-parameters.txt | 6 +- Documentation/core-api/dma-attributes.rst | 10 + drivers/base/core.c | 4 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- drivers/gpu/drm/drm_prime.c | 2 +- include/linux/device.h | 12 + include/linux/dma-mapping.h | 6 + include/linux/swiotlb.h | 54 ++- kernel/dma/swiotlb.c | 382 ++++++++++++++++-- 9 files changed, 443 insertions(+), 35 deletions(-) -- 2.25.1
Hi, On Wed, 19 Apr 2023 12:03:52 +0200 Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > The goal of my work is to provide more flexibility in the sizing of > SWIOTLB. > > The software IO TLB was designed with these assumptions: > > 1. It would not be used much, especially on 64-bit systems. > 2. A small fixed memory area (64 MiB by default) is sufficient to > handle the few cases which require a bounce buffer. > 3. 64 MiB is little enough that it has no impact on the rest of the > system. > > First, if SEV is active, all DMA must be done through shared > unencrypted pages, and SWIOTLB is used to make this happen without > changing device drivers. The software IO TLB size is increased to > 6% of total memory in sev_setup_arch(), but that is more of an > approximation. The actual requirements may vary depending on the > amount of I/O and which drivers are used. These factors may not be > know at boot time, i.e. when SWIOTLB is allocated. > > Second, other colleagues have noticed that they can reliably get > rid of occasional OOM kills on an Arm embedded device by reducing > the SWIOTLB size. This can be achieved with a kernel parameter, but > determining the right value puts additional burden on pre-release > testing, which could be avoided if SWIOTLB is allocated small and > grows only when necessary. Now that merging into 6.4 has begun, what about this patch series? I'm eager to get some feedback (positive or negative) and respin the next version. Petr T
On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > Hi, > > On Wed, 19 Apr 2023 12:03:52 +0200 > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > The goal of my work is to provide more flexibility in the sizing of > > SWIOTLB. > > > > The software IO TLB was designed with these assumptions: > > > > 1. It would not be used much, especially on 64-bit systems. > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > handle the few cases which require a bounce buffer. > > 3. 64 MiB is little enough that it has no impact on the rest of the > > system. > > > > First, if SEV is active, all DMA must be done through shared > > unencrypted pages, and SWIOTLB is used to make this happen without > > changing device drivers. The software IO TLB size is increased to > > 6% of total memory in sev_setup_arch(), but that is more of an > > approximation. The actual requirements may vary depending on the > > amount of I/O and which drivers are used. These factors may not be > > know at boot time, i.e. when SWIOTLB is allocated. > > > > Second, other colleagues have noticed that they can reliably get > > rid of occasional OOM kills on an Arm embedded device by reducing > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > determining the right value puts additional burden on pre-release > > testing, which could be avoided if SWIOTLB is allocated small and > > grows only when necessary. > > Now that merging into 6.4 has begun, what about this patch series? I'm > eager to get some feedback (positive or negative) and respin the next > version. It's the merge window, we can't add new things that haven't been in linux-next already. Please resubmit it after -rc1 is out. thanks, greg k-h
Hi Greg, On Wed, 26 Apr 2023 14:26:36 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > Hi, > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > SWIOTLB. > > > > > > The software IO TLB was designed with these assumptions: > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > handle the few cases which require a bounce buffer. > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > system. > > > > > > First, if SEV is active, all DMA must be done through shared > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > changing device drivers. The software IO TLB size is increased to > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > approximation. The actual requirements may vary depending on the > > > amount of I/O and which drivers are used. These factors may not be > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > Second, other colleagues have noticed that they can reliably get > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > determining the right value puts additional burden on pre-release > > > testing, which could be avoided if SWIOTLB is allocated small and > > > grows only when necessary. > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > eager to get some feedback (positive or negative) and respin the next > > version. > > It's the merge window, we can't add new things that haven't been in > linux-next already. This is understood. I'm not asking for immediate inclusion. > Please resubmit it after -rc1 is out. If you can believe that rebasing to -rc1 will be enough, then I will also try to believe I'm lucky. ;-) The kind of feedback I really want to get is e.g. about the extra per-device DMA-specific fields. If they cannot be added to struct device, then I'd rather start discussing an interim solution, because getting all existing DMA fields out of that struct will take a lot of time... Thanks, Petr T
On Wed, 26 Apr 2023 14:44:39 +0200 Petr Tesařík <petr@tesarici.cz> wrote: > Hi Greg, > > On Wed, 26 Apr 2023 14:26:36 +0200 > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > > Hi, > > > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > > SWIOTLB. > > > > > > > > The software IO TLB was designed with these assumptions: > > > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > > handle the few cases which require a bounce buffer. > > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > > system. > > > > > > > > First, if SEV is active, all DMA must be done through shared > > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > > changing device drivers. The software IO TLB size is increased to > > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > > approximation. The actual requirements may vary depending on the > > > > amount of I/O and which drivers are used. These factors may not be > > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > > > Second, other colleagues have noticed that they can reliably get > > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > > determining the right value puts additional burden on pre-release > > > > testing, which could be avoided if SWIOTLB is allocated small and > > > > grows only when necessary. > > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > > eager to get some feedback (positive or negative) and respin the next > > > version. > > > > It's the merge window, we can't add new things that haven't been in > > linux-next already. > > This is understood. I'm not asking for immediate inclusion. > > > Please resubmit it after -rc1 is out. > > If you can believe that rebasing to -rc1 will be enough, then I will > also try to believe I'm lucky. ;-) > > The kind of feedback I really want to get is e.g. about the extra > per-device DMA-specific fields. If they cannot be added to struct > device, then I'd rather start discussing an interim solution, because > getting all existing DMA fields out of that struct will take a lot of > time... All right, 6.4-rc1 is out now. The patch series still applies cleanly. Any comments what must be changed (if anything) to get it in? Thanks, Petr T
On Tue, May 09, 2023 at 09:16:35AM +0200, Petr Tesařík wrote: > On Wed, 26 Apr 2023 14:44:39 +0200 > Petr Tesařík <petr@tesarici.cz> wrote: > > > Hi Greg, > > > > On Wed, 26 Apr 2023 14:26:36 +0200 > > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > > > Hi, > > > > > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > > > SWIOTLB. > > > > > > > > > > The software IO TLB was designed with these assumptions: > > > > > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > > > handle the few cases which require a bounce buffer. > > > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > > > system. > > > > > > > > > > First, if SEV is active, all DMA must be done through shared > > > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > > > changing device drivers. The software IO TLB size is increased to > > > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > > > approximation. The actual requirements may vary depending on the > > > > > amount of I/O and which drivers are used. These factors may not be > > > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > > > > > Second, other colleagues have noticed that they can reliably get > > > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > > > determining the right value puts additional burden on pre-release > > > > > testing, which could be avoided if SWIOTLB is allocated small and > > > > > grows only when necessary. > > > > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > > > eager to get some feedback (positive or negative) and respin the next > > > > version. > > > > > > It's the merge window, we can't add new things that haven't been in > > > linux-next already. > > > > This is understood. I'm not asking for immediate inclusion. > > > > > Please resubmit it after -rc1 is out. > > > > If you can believe that rebasing to -rc1 will be enough, then I will > > also try to believe I'm lucky. ;-) > > > > The kind of feedback I really want to get is e.g. about the extra > > per-device DMA-specific fields. If they cannot be added to struct > > device, then I'd rather start discussing an interim solution, because > > getting all existing DMA fields out of that struct will take a lot of > > time... > > All right, 6.4-rc1 is out now. The patch series still applies cleanly. > > Any comments what must be changed (if anything) to get it in? Try resending it, it's long out of my review queue... thanks, greg k-h
Hi I was hoping this might land for 6.5-rc1, is there a new version that might apply against 6.5? Cheers Mike On Tue, 9 May 2023 at 08:32, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Tue, May 09, 2023 at 09:16:35AM +0200, Petr Tesařík wrote: > > On Wed, 26 Apr 2023 14:44:39 +0200 > > Petr Tesařík <petr@tesarici.cz> wrote: > > > > > Hi Greg, > > > > > > On Wed, 26 Apr 2023 14:26:36 +0200 > > > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > > > > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > > > > Hi, > > > > > > > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > > > > SWIOTLB. > > > > > > > > > > > > The software IO TLB was designed with these assumptions: > > > > > > > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > > > > handle the few cases which require a bounce buffer. > > > > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > > > > system. > > > > > > > > > > > > First, if SEV is active, all DMA must be done through shared > > > > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > > > > changing device drivers. The software IO TLB size is increased to > > > > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > > > > approximation. The actual requirements may vary depending on the > > > > > > amount of I/O and which drivers are used. These factors may not be > > > > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > > > > > > > Second, other colleagues have noticed that they can reliably get > > > > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > > > > determining the right value puts additional burden on pre-release > > > > > > testing, which could be avoided if SWIOTLB is allocated small and > > > > > > grows only when necessary. > > > > > > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > > > > eager to get some feedback (positive or negative) and respin the next > > > > > version. > > > > > > > > It's the merge window, we can't add new things that haven't been in > > > > linux-next already. > > > > > > This is understood. I'm not asking for immediate inclusion. > > > > > > > Please resubmit it after -rc1 is out. > > > > > > If you can believe that rebasing to -rc1 will be enough, then I will > > > also try to believe I'm lucky. ;-) > > > > > > The kind of feedback I really want to get is e.g. about the extra > > > per-device DMA-specific fields. If they cannot be added to struct > > > device, then I'd rather start discussing an interim solution, because > > > getting all existing DMA fields out of that struct will take a lot of > > > time... > > > > All right, 6.4-rc1 is out now. The patch series still applies cleanly. > > > > Any comments what must be changed (if anything) to get it in? > > Try resending it, it's long out of my review queue... > > thanks, > > greg k-h
On Mon, 10 Jul 2023 23:23:45 +0100 Mike Lothian <mike@fireburn.co.uk> wrote: > Hi > > I was hoping this might land for 6.5-rc1, is there a new version that > might apply against 6.5? Yes, there is a v3, which is a complete rewrite based on feedback from various people on this mailing list: https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/ Petr T > Cheers > > Mike > > On Tue, 9 May 2023 at 08:32, Greg Kroah-Hartman > <gregkh@linuxfoundation.org> wrote: > > > > On Tue, May 09, 2023 at 09:16:35AM +0200, Petr Tesařík wrote: > > > On Wed, 26 Apr 2023 14:44:39 +0200 > > > Petr Tesařík <petr@tesarici.cz> wrote: > > > > > > > Hi Greg, > > > > > > > > On Wed, 26 Apr 2023 14:26:36 +0200 > > > > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > > > > > > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > > > > > Hi, > > > > > > > > > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > > > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > > > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > > > > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > > > > > SWIOTLB. > > > > > > > > > > > > > > The software IO TLB was designed with these assumptions: > > > > > > > > > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > > > > > handle the few cases which require a bounce buffer. > > > > > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > > > > > system. > > > > > > > > > > > > > > First, if SEV is active, all DMA must be done through shared > > > > > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > > > > > changing device drivers. The software IO TLB size is increased to > > > > > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > > > > > approximation. The actual requirements may vary depending on the > > > > > > > amount of I/O and which drivers are used. These factors may not be > > > > > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > > > > > > > > > Second, other colleagues have noticed that they can reliably get > > > > > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > > > > > determining the right value puts additional burden on pre-release > > > > > > > testing, which could be avoided if SWIOTLB is allocated small and > > > > > > > grows only when necessary. > > > > > > > > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > > > > > eager to get some feedback (positive or negative) and respin the next > > > > > > version. > > > > > > > > > > It's the merge window, we can't add new things that haven't been in > > > > > linux-next already. > > > > > > > > This is understood. I'm not asking for immediate inclusion. > > > > > > > > > Please resubmit it after -rc1 is out. > > > > > > > > If you can believe that rebasing to -rc1 will be enough, then I will > > > > also try to believe I'm lucky. ;-) > > > > > > > > The kind of feedback I really want to get is e.g. about the extra > > > > per-device DMA-specific fields. If they cannot be added to struct > > > > device, then I'd rather start discussing an interim solution, because > > > > getting all existing DMA fields out of that struct will take a lot of > > > > time... > > > > > > All right, 6.4-rc1 is out now. The patch series still applies cleanly. > > > > > > Any comments what must be changed (if anything) to get it in? > [...]
On Tue, 11 Jul 2023 at 10:03, Petr Tesařík <petr@tesarici.cz> wrote: > > On Mon, 10 Jul 2023 23:23:45 +0100 > Mike Lothian <mike@fireburn.co.uk> wrote: > > > Hi > > > > I was hoping this might land for 6.5-rc1, is there a new version that > > might apply against 6.5? > > Yes, there is a v3, which is a complete rewrite based on feedback from > various people on this mailing list: > > https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/ > > Petr T > Patch 2 doesn't apply cleanly for me on 6.5-rc1
On Tue, 11 Jul 2023 12:29:44 +0100 Mike Lothian <mike@fireburn.co.uk> wrote: > On Tue, 11 Jul 2023 at 10:03, Petr Tesařík <petr@tesarici.cz> wrote: > > > > On Mon, 10 Jul 2023 23:23:45 +0100 > > Mike Lothian <mike@fireburn.co.uk> wrote: > > > > > Hi > > > > > > I was hoping this might land for 6.5-rc1, is there a new version that > > > might apply against 6.5? > > > > Yes, there is a v3, which is a complete rewrite based on feedback from > > various people on this mailing list: > > > > https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/ > > > > Petr T > > > > > Patch 2 doesn't apply cleanly for me on 6.5-rc1 Ah, right. I'm going to rebase the series and include a few other suggested changes. I'm a bit worried that Christoph and all other maintainers (all taken back into Cc) have stayed silent about the v3 series. @Christoph: Are uncomfortable with something in the idea itself, or are you just busy with other things? Petr T
On Tue, 11 Jul 2023 at 14:21, Petr Tesařík <petr@tesarici.cz> wrote: > > On Tue, 11 Jul 2023 12:29:44 +0100 > Mike Lothian <mike@fireburn.co.uk> wrote: > > > On Tue, 11 Jul 2023 at 10:03, Petr Tesařík <petr@tesarici.cz> wrote: > > > > > > On Mon, 10 Jul 2023 23:23:45 +0100 > > > Mike Lothian <mike@fireburn.co.uk> wrote: > > > > > > > Hi > > > > > > > > I was hoping this might land for 6.5-rc1, is there a new version that > > > > might apply against 6.5? > > > > > > Yes, there is a v3, which is a complete rewrite based on feedback from > > > various people on this mailing list: > > > > > > https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/ > > > > > > Petr T > > > > > > > > > Patch 2 doesn't apply cleanly for me on 6.5-rc1 > > Ah, right. I'm going to rebase the series and include a few other > suggested changes. > > I'm a bit worried that Christoph and all other maintainers (all taken > back into Cc) have stayed silent about the v3 series. > > @Christoph: Are uncomfortable with something in the idea itself, or are > you just busy with other things? > > Petr T I imagine once 6.4 starts appearing in distros there will be more bugs appearing when people's wifi disconnects If this is being delayed until 6.6 would it be better to increase the default number, perhaps based on how much memory the system has and maybe back port the fix?
On Wed, Apr 26, 2023 at 02:44:39PM +0200, Petr Tesařík wrote: > Hi Greg, > > On Wed, 26 Apr 2023 14:26:36 +0200 > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > > Hi, > > > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > > SWIOTLB. > > > > > > > > The software IO TLB was designed with these assumptions: > > > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > > handle the few cases which require a bounce buffer. > > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > > system. > > > > > > > > First, if SEV is active, all DMA must be done through shared > > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > > changing device drivers. The software IO TLB size is increased to > > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > > approximation. The actual requirements may vary depending on the > > > > amount of I/O and which drivers are used. These factors may not be > > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > > > Second, other colleagues have noticed that they can reliably get > > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > > determining the right value puts additional burden on pre-release > > > > testing, which could be avoided if SWIOTLB is allocated small and > > > > grows only when necessary. > > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > > eager to get some feedback (positive or negative) and respin the next > > > version. > > > > It's the merge window, we can't add new things that haven't been in > > linux-next already. > > This is understood. I'm not asking for immediate inclusion. > > > Please resubmit it after -rc1 is out. > > If you can believe that rebasing to -rc1 will be enough, then I will > also try to believe I'm lucky. ;-) > > The kind of feedback I really want to get is e.g. about the extra > per-device DMA-specific fields. If they cannot be added to struct > device, then I'd rather start discussing an interim solution, because > getting all existing DMA fields out of that struct will take a lot of > time... I thought the goal was to get them out of the device and into the bus instead right? Or was it the other way around? I can't remember anymore, sorry... greg k-h
On Wed, 26 Apr 2023 14:53:52 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > On Wed, Apr 26, 2023 at 02:44:39PM +0200, Petr Tesařík wrote: > > Hi Greg, > > > > On Wed, 26 Apr 2023 14:26:36 +0200 > > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote: > > > > Hi, > > > > > > > > On Wed, 19 Apr 2023 12:03:52 +0200 > > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > > > SWIOTLB. > > > > > > > > > > The software IO TLB was designed with these assumptions: > > > > > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > > > handle the few cases which require a bounce buffer. > > > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > > > system. > > > > > > > > > > First, if SEV is active, all DMA must be done through shared > > > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > > > changing device drivers. The software IO TLB size is increased to > > > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > > > approximation. The actual requirements may vary depending on the > > > > > amount of I/O and which drivers are used. These factors may not be > > > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > > > > > Second, other colleagues have noticed that they can reliably get > > > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > > > determining the right value puts additional burden on pre-release > > > > > testing, which could be avoided if SWIOTLB is allocated small and > > > > > grows only when necessary. > > > > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm > > > > eager to get some feedback (positive or negative) and respin the next > > > > version. > > > > > > It's the merge window, we can't add new things that haven't been in > > > linux-next already. > > > > This is understood. I'm not asking for immediate inclusion. > > > > > Please resubmit it after -rc1 is out. > > > > If you can believe that rebasing to -rc1 will be enough, then I will > > also try to believe I'm lucky. ;-) > > > > The kind of feedback I really want to get is e.g. about the extra > > per-device DMA-specific fields. If they cannot be added to struct > > device, then I'd rather start discussing an interim solution, because > > getting all existing DMA fields out of that struct will take a lot of > > time... > > I thought the goal was to get them out of the device and into the bus > instead right? Or was it the other way around? I can't remember > anymore, sorry... You remember it almost right. The goal is to get them out of struct device into the bus (or platform device, or whatever holds struct device). But I'd like to keep this task decoupled from the dynamic swiotlb. Thanks, Petr T
On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > The goal of my work is to provide more flexibility in the sizing of > SWIOTLB. > > The software IO TLB was designed with these assumptions: > > 1. It would not be used much, especially on 64-bit systems. > 2. A small fixed memory area (64 MiB by default) is sufficient to > handle the few cases which require a bounce buffer. > 3. 64 MiB is little enough that it has no impact on the rest of the > system. > > First, if SEV is active, all DMA must be done through shared > unencrypted pages, and SWIOTLB is used to make this happen without > changing device drivers. The software IO TLB size is increased to > 6% of total memory in sev_setup_arch(), but that is more of an > approximation. The actual requirements may vary depending on the > amount of I/O and which drivers are used. These factors may not be > know at boot time, i.e. when SWIOTLB is allocated. > > Second, other colleagues have noticed that they can reliably get > rid of occasional OOM kills on an Arm embedded device by reducing > the SWIOTLB size. This can be achieved with a kernel parameter, but > determining the right value puts additional burden on pre-release > testing, which could be avoided if SWIOTLB is allocated small and > grows only when necessary. > > Changes from v1-devel-v7: > - Add comments to acquire/release barriers > - Fix whitespace issues reported by checkpatch.pl > > Changes from v1-devel-v6: > - Provide long description of functions > - Fix kernel-doc (Returns: to Return:) > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() > > Changes from RFC: > - Track dynamic buffers per device instead of per swiotlb > - Use a linked list instead of a maple tree > - Move initialization of swiotlb fields of struct device to a > helper function > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() > - Introduce per-device flag if dynamic buffers are in use > - Add one more user of DMA_ATTR_MAY_SLEEP > - Add kernel-doc comments for new (and some old) code > - Properly escape '*' in dma-attributes.rst > > Petr Tesarik (7): > swiotlb: Use a helper to initialize swiotlb fields in struct device > swiotlb: Move code around in preparation for dynamic bounce buffers > dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute > swiotlb: Dynamically allocated bounce buffers > swiotlb: Add a boot option to enable dynamic bounce buffers > drm: Use DMA_ATTR_MAY_SLEEP from process context > swiotlb: per-device flag if there are dynamically allocated buffers > > .../admin-guide/kernel-parameters.txt | 6 +- > Documentation/core-api/dma-attributes.rst | 10 + > drivers/base/core.c | 4 +- > drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- > drivers/gpu/drm/drm_prime.c | 2 +- > include/linux/device.h | 12 + > include/linux/dma-mapping.h | 6 + > include/linux/swiotlb.h | 54 ++- > kernel/dma/swiotlb.c | 382 ++++++++++++++++-- > 9 files changed, 443 insertions(+), 35 deletions(-) > > -- > 2.25.1 > Hi Is this a potential fix for https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually setting bigger buffers to keep my wifi working? Thanks Mike
On Fri, 28 Apr 2023 09:53:38 +0100 Mike Lothian <mike@fireburn.co.uk> wrote: > On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > The goal of my work is to provide more flexibility in the sizing of > > SWIOTLB. > > > > The software IO TLB was designed with these assumptions: > > > > 1. It would not be used much, especially on 64-bit systems. > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > handle the few cases which require a bounce buffer. > > 3. 64 MiB is little enough that it has no impact on the rest of the > > system. > > > > First, if SEV is active, all DMA must be done through shared > > unencrypted pages, and SWIOTLB is used to make this happen without > > changing device drivers. The software IO TLB size is increased to > > 6% of total memory in sev_setup_arch(), but that is more of an > > approximation. The actual requirements may vary depending on the > > amount of I/O and which drivers are used. These factors may not be > > know at boot time, i.e. when SWIOTLB is allocated. > > > > Second, other colleagues have noticed that they can reliably get > > rid of occasional OOM kills on an Arm embedded device by reducing > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > determining the right value puts additional burden on pre-release > > testing, which could be avoided if SWIOTLB is allocated small and > > grows only when necessary. > > > > Changes from v1-devel-v7: > > - Add comments to acquire/release barriers > > - Fix whitespace issues reported by checkpatch.pl > > > > Changes from v1-devel-v6: > > - Provide long description of functions > > - Fix kernel-doc (Returns: to Return:) > > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() > > > > Changes from RFC: > > - Track dynamic buffers per device instead of per swiotlb > > - Use a linked list instead of a maple tree > > - Move initialization of swiotlb fields of struct device to a > > helper function > > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() > > - Introduce per-device flag if dynamic buffers are in use > > - Add one more user of DMA_ATTR_MAY_SLEEP > > - Add kernel-doc comments for new (and some old) code > > - Properly escape '*' in dma-attributes.rst > > > > Petr Tesarik (7): > > swiotlb: Use a helper to initialize swiotlb fields in struct device > > swiotlb: Move code around in preparation for dynamic bounce buffers > > dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute > > swiotlb: Dynamically allocated bounce buffers > > swiotlb: Add a boot option to enable dynamic bounce buffers > > drm: Use DMA_ATTR_MAY_SLEEP from process context > > swiotlb: per-device flag if there are dynamically allocated buffers > > > > .../admin-guide/kernel-parameters.txt | 6 +- > > Documentation/core-api/dma-attributes.rst | 10 + > > drivers/base/core.c | 4 +- > > drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- > > drivers/gpu/drm/drm_prime.c | 2 +- > > include/linux/device.h | 12 + > > include/linux/dma-mapping.h | 6 + > > include/linux/swiotlb.h | 54 ++- > > kernel/dma/swiotlb.c | 382 ++++++++++++++++-- > > 9 files changed, 443 insertions(+), 35 deletions(-) > > > > -- > > 2.25.1 > > > > Hi > > Is this a potential fix for > https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually > setting bigger buffers to keep my wifi working? Yes. With these patches applied, your system should run just fine with swiotlb=dynamic. However, keep in mind that this implementation adds a bit of overhead. In short, it trades a bit of performance for not having to figure out the optimal swiotlb size at boot time. Petr T
Hi I've not had any issues since using this, but I imagine most people won't know how to set swiotlb=dynamic if they start seeing this (when it lands) Any clue as to why this broke last cycle? Thanks Mike On Fri, 28 Apr 2023 at 10:07, Petr Tesařík <petr@tesarici.cz> wrote: > > On Fri, 28 Apr 2023 09:53:38 +0100 > Mike Lothian <mike@fireburn.co.uk> wrote: > > > On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote: > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com> > > > > > > The goal of my work is to provide more flexibility in the sizing of > > > SWIOTLB. > > > > > > The software IO TLB was designed with these assumptions: > > > > > > 1. It would not be used much, especially on 64-bit systems. > > > 2. A small fixed memory area (64 MiB by default) is sufficient to > > > handle the few cases which require a bounce buffer. > > > 3. 64 MiB is little enough that it has no impact on the rest of the > > > system. > > > > > > First, if SEV is active, all DMA must be done through shared > > > unencrypted pages, and SWIOTLB is used to make this happen without > > > changing device drivers. The software IO TLB size is increased to > > > 6% of total memory in sev_setup_arch(), but that is more of an > > > approximation. The actual requirements may vary depending on the > > > amount of I/O and which drivers are used. These factors may not be > > > know at boot time, i.e. when SWIOTLB is allocated. > > > > > > Second, other colleagues have noticed that they can reliably get > > > rid of occasional OOM kills on an Arm embedded device by reducing > > > the SWIOTLB size. This can be achieved with a kernel parameter, but > > > determining the right value puts additional burden on pre-release > > > testing, which could be avoided if SWIOTLB is allocated small and > > > grows only when necessary. > > > > > > Changes from v1-devel-v7: > > > - Add comments to acquire/release barriers > > > - Fix whitespace issues reported by checkpatch.pl > > > > > > Changes from v1-devel-v6: > > > - Provide long description of functions > > > - Fix kernel-doc (Returns: to Return:) > > > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() > > > > > > Changes from RFC: > > > - Track dynamic buffers per device instead of per swiotlb > > > - Use a linked list instead of a maple tree > > > - Move initialization of swiotlb fields of struct device to a > > > helper function > > > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() > > > - Introduce per-device flag if dynamic buffers are in use > > > - Add one more user of DMA_ATTR_MAY_SLEEP > > > - Add kernel-doc comments for new (and some old) code > > > - Properly escape '*' in dma-attributes.rst > > > > > > Petr Tesarik (7): > > > swiotlb: Use a helper to initialize swiotlb fields in struct device > > > swiotlb: Move code around in preparation for dynamic bounce buffers > > > dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute > > > swiotlb: Dynamically allocated bounce buffers > > > swiotlb: Add a boot option to enable dynamic bounce buffers > > > drm: Use DMA_ATTR_MAY_SLEEP from process context > > > swiotlb: per-device flag if there are dynamically allocated buffers > > > > > > .../admin-guide/kernel-parameters.txt | 6 +- > > > Documentation/core-api/dma-attributes.rst | 10 + > > > drivers/base/core.c | 4 +- > > > drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- > > > drivers/gpu/drm/drm_prime.c | 2 +- > > > include/linux/device.h | 12 + > > > include/linux/dma-mapping.h | 6 + > > > include/linux/swiotlb.h | 54 ++- > > > kernel/dma/swiotlb.c | 382 ++++++++++++++++-- > > > 9 files changed, 443 insertions(+), 35 deletions(-) > > > > > > -- > > > 2.25.1 > > > > > > > Hi > > > > Is this a potential fix for > > https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually > > setting bigger buffers to keep my wifi working? > > Yes. With these patches applied, your system should run just fine with > swiotlb=dynamic. However, keep in mind that this implementation adds a > bit of overhead. In short, it trades a bit of performance for not > having to figure out the optimal swiotlb size at boot time. > > Petr T
Hi Mike, On 5/1/2023 12:29 PM, Mike Lothian wrote: > Hi > > I've not had any issues since using this, but I imagine most people > won't know how to set swiotlb=dynamic if they start seeing this (when > it lands) > > Any clue as to why this broke last cycle? IIRC the kernel messages indicate high swiotlb utilization. The most likely cause is that either the wifi driver is using more DMA buffers at once, or another device driver is now using swiotlb; it could also be a driver for a device that was previously not supported. If you're adventurous, you can use ftrace to capture the backtraces of swiotlb_tbl_map_single. This would reveal the other driver. Debugging increased DMA buffer usage by your wifi driver may be more difficult, because that's a dynamic process, which may change dramatically if you add some sort of tracing into the picture... As always, YMMV. Petr T > Mike > > On Fri, 28 Apr 2023 at 10:07, Petr Tesařík <petr@tesarici.cz> wrote: >> >> On Fri, 28 Apr 2023 09:53:38 +0100 >> Mike Lothian <mike@fireburn.co.uk> wrote: >> >>> On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote: >>>> >>>> From: Petr Tesarik <petr.tesarik.ext@huawei.com> >>>> >>>> The goal of my work is to provide more flexibility in the sizing of >>>> SWIOTLB. >>>> >>>> The software IO TLB was designed with these assumptions: >>>> >>>> 1. It would not be used much, especially on 64-bit systems. >>>> 2. A small fixed memory area (64 MiB by default) is sufficient to >>>> handle the few cases which require a bounce buffer. >>>> 3. 64 MiB is little enough that it has no impact on the rest of the >>>> system. >>>> >>>> First, if SEV is active, all DMA must be done through shared >>>> unencrypted pages, and SWIOTLB is used to make this happen without >>>> changing device drivers. The software IO TLB size is increased to >>>> 6% of total memory in sev_setup_arch(), but that is more of an >>>> approximation. The actual requirements may vary depending on the >>>> amount of I/O and which drivers are used. These factors may not be >>>> know at boot time, i.e. when SWIOTLB is allocated. >>>> >>>> Second, other colleagues have noticed that they can reliably get >>>> rid of occasional OOM kills on an Arm embedded device by reducing >>>> the SWIOTLB size. This can be achieved with a kernel parameter, but >>>> determining the right value puts additional burden on pre-release >>>> testing, which could be avoided if SWIOTLB is allocated small and >>>> grows only when necessary. >>>> >>>> Changes from v1-devel-v7: >>>> - Add comments to acquire/release barriers >>>> - Fix whitespace issues reported by checkpatch.pl >>>> >>>> Changes from v1-devel-v6: >>>> - Provide long description of functions >>>> - Fix kernel-doc (Returns: to Return:) >>>> - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() >>>> >>>> Changes from RFC: >>>> - Track dynamic buffers per device instead of per swiotlb >>>> - Use a linked list instead of a maple tree >>>> - Move initialization of swiotlb fields of struct device to a >>>> helper function >>>> - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked() >>>> - Introduce per-device flag if dynamic buffers are in use >>>> - Add one more user of DMA_ATTR_MAY_SLEEP >>>> - Add kernel-doc comments for new (and some old) code >>>> - Properly escape '*' in dma-attributes.rst >>>> >>>> Petr Tesarik (7): >>>> swiotlb: Use a helper to initialize swiotlb fields in struct device >>>> swiotlb: Move code around in preparation for dynamic bounce buffers >>>> dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute >>>> swiotlb: Dynamically allocated bounce buffers >>>> swiotlb: Add a boot option to enable dynamic bounce buffers >>>> drm: Use DMA_ATTR_MAY_SLEEP from process context >>>> swiotlb: per-device flag if there are dynamically allocated buffers >>>> >>>> .../admin-guide/kernel-parameters.txt | 6 +- >>>> Documentation/core-api/dma-attributes.rst | 10 + >>>> drivers/base/core.c | 4 +- >>>> drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- >>>> drivers/gpu/drm/drm_prime.c | 2 +- >>>> include/linux/device.h | 12 + >>>> include/linux/dma-mapping.h | 6 + >>>> include/linux/swiotlb.h | 54 ++- >>>> kernel/dma/swiotlb.c | 382 ++++++++++++++++-- >>>> 9 files changed, 443 insertions(+), 35 deletions(-) >>>> >>>> -- >>>> 2.25.1 >>>> >>> >>> Hi >>> >>> Is this a potential fix for >>> https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually >>> setting bigger buffers to keep my wifi working? >> >> Yes. With these patches applied, your system should run just fine with >> swiotlb=dynamic. However, keep in mind that this implementation adds a >> bit of overhead. In short, it trades a bit of performance for not >> having to figure out the optimal swiotlb size at boot time. >> >> Petr T
© 2016 - 2025 Red Hat, Inc.