The main goal of the series is to fix the DW DMAC driver to be working
better with the serial 8250 device driver implementation. In particular it
was discovered that there is a random system freeze (caused by a
deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
error printed to the log when the DW APB UART interface is used in
conjunction with the DW DMA controller. Although I guess the problem can
be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
execution. Anyway this short series contains two patches fixing these
bugs. Please see the respective patches log for details.

Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
Changelog RFC:
- Add a new patch:
  [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
  fixing the "XFER bit set, but channel not idle" error.
- Instead of just dropping the dwc_scan_descriptors() method invocation
  calculate the residue in the Tx-status getter.

base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Cc: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: dmaengine@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (2):
  dmaengine: dw: Prevent tx-status calling DMA-desc callback
  dmaengine: dw: Fix XFER bit set, but channel not idle error

 drivers/dma/dw/core.c | 144 ++++++++++++++++++++++--------------------
 1 file changed, 75 insertions(+), 69 deletions(-)

--
2.43.0
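
To illustrate the class of bug the cover letter describes, here is a minimal user-space C sketch, not the driver code. It assumes (this is not stated in the cover letter itself) that the client queries the transfer status while holding a lock that its own completion callback also takes. All toy_* names are hypothetical; the lock is a plain flag whose toy_trylock() reports the would-be deadlock instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-recursive lock (e.g. a client's port lock). */
struct toy_lock {
	bool held;
};

/* Returns false where a real non-recursive spinlock would spin forever. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/* Hypothetical channel state: one transfer of 'total' bytes. */
struct toy_chan {
	size_t total;			/* bytes requested */
	size_t done;			/* bytes already transferred */
	struct toy_lock *client_lock;	/* lock the client callback takes */
	bool deadlocked;		/* set when the callback could not lock */
	unsigned int callbacks_run;	/* callbacks that completed normally */
};

/* Client completion callback: it needs the client lock to do its work. */
static void toy_complete_cb(struct toy_chan *c)
{
	if (!toy_trylock(c->client_lock)) {
		c->deadlocked = true;	/* lock already held by our own caller */
		return;
	}
	c->callbacks_run++;
	toy_unlock(c->client_lock);
}

/* Problematic pattern: the status getter also finalizes the transfer. */
static size_t toy_tx_status_with_callback(struct toy_chan *c)
{
	if (c->done == c->total)
		toy_complete_cb(c);
	return c->total - c->done;
}

/* Side-effect-free pattern: the status getter only computes the residue. */
static size_t toy_tx_status_residue_only(const struct toy_chan *c)
{
	return c->total - c->done;
}

/* Simulated client IRQ path: query the status while holding the client lock. */
static size_t toy_irq_query(struct toy_chan *c, bool getter_runs_callback)
{
	size_t residue;
	bool locked = toy_trylock(c->client_lock);

	assert(locked);	/* the IRQ path is the first lock taker in this model */
	residue = getter_runs_callback ? toy_tx_status_with_callback(c)
				       : toy_tx_status_residue_only(c);
	toy_unlock(c->client_lock);
	return residue;
}
```

Calling toy_irq_query(&ch, true) on a finished transfer sets ch.deadlocked, because the callback runs under the lock already held by the caller. Calling toy_irq_query(&ch, false) only returns the residue, which mirrors the changelog item about calculating the residue in the Tx-status getter.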
On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> The main goal of the series is to fix the DW DMAC driver to be working
> better with the serial 8250 device driver implementation. In particular it
> was discovered that there is a random system freeze (caused by a
> deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> error printed to the log when the DW APB UART interface is used in
> conjunction with the DW DMA controller. Although I guess the problem can
> be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> execution. Anyway this short series contains two patches fixing these
> bugs. Please see the respective patches log for details.
>
> Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> Changelog RFC:
> - Add a new patch:
>   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
>   fixing the "XFER bit set, but channel not idle" error.
> - Instead of just dropping the dwc_scan_descriptors() method invocation
>   calculate the residue in the Tx-status getter.

FWIW, this series does not regress on Intel Merrifield (SPI case),
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

P.S.
However it might need additional tests on the DW UART based platforms.
Cc'ed to Hans just in case (it might be that he can add this to his repo for
testing on Bay Trail and Cherry Trail, which may use the DW UART for BT
operations).

--
With Best Regards,
Andy Shevchenko
Hi Andy

On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > The main goal of the series is to fix the DW DMAC driver to be working
> > better with the serial 8250 device driver implementation. In particular it
> > was discovered that there is a random system freeze (caused by a
> > deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> > error printed to the log when the DW APB UART interface is used in
> > conjunction with the DW DMA controller. Although I guess the problem can
> > be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> > execution. Anyway this short series contains two patches fixing these
> > bugs. Please see the respective patches log for details.
> >
> > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > Changelog RFC:
> > - Add a new patch:
> >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> >   fixing the "XFER bit set, but channel not idle" error.
> > - Instead of just dropping the dwc_scan_descriptors() method invocation
> >   calculate the residue in the Tx-status getter.
>
> FWIW, this series does not regress on Intel Merrifield (SPI case),
> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>

Great! Thanks.

> P.S.
> However it might need additional tests on the DW UART based platforms.
> Cc'ed to Hans just in case (it might be that he can add this to his repo for
> testing on Bay Trail and Cherry Trail, which may use the DW UART for BT
> operations).

It's not enough though. The DW UART controller must be connected to
the DW DMAC handshaking interface on the platform. The kernel must be
properly set up for that too. In that case the test would be done on
a proper target. Do the Bay Trail and Cherry Trail chips support such
HW-setup? If so the additional test would be very welcome.

Some time ago you said that you seemed to meet a similar issue on older
machines:
https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
If it's still possible could you please perform at least some smoke
test on those devices?

In case of my device this series and a previous one
https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
fixed all the critical issues for the DW UART + DW DMAC buddies:
1. Sudden data disappearing at the tail of the transfers (previous
patch set).
2. Random system freeze (this patch set).

There is another problem caused by the too slow coherent memory IO on
my device. Due to that the data gets copied too slowly in the
__dma_rx_complete()->tty_insert_flip_string() call. As a result fast
incoming traffic overflows the DW UART inbound FIFO. But that can be
worked around by decreasing the Rx DMA-buffer size. (There are some
more generic fixes possible, but they haven't proven to be as effective
as the buffer size reduction.)

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>
>
On Fri, Sep 20, 2024 at 12:33:51PM +0300, Serge Semin wrote:
> On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> > On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > > The main goal of the series is to fix the DW DMAC driver to be working
> > > better with the serial 8250 device driver implementation. In particular it
> > > was discovered that there is a random system freeze (caused by a
> > > deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> > > error printed to the log when the DW APB UART interface is used in
> > > conjunction with the DW DMA controller. Although I guess the problem can
> > > be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> > > execution. Anyway this short series contains two patches fixing these
> > > bugs. Please see the respective patches log for details.
> > >
> > > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > > Changelog RFC:
> > > - Add a new patch:
> > >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> > >   fixing the "XFER bit set, but channel not idle" error.
> > > - Instead of just dropping the dwc_scan_descriptors() method invocation
> > >   calculate the residue in the Tx-status getter.
>
> > FWIW, this series does not regress on Intel Merrifield (SPI case),
> > Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
>
> Great! Thanks.
>
> > P.S.
> > However it might need additional tests on the DW UART based platforms.
> > Cc'ed to Hans just in case (it might be that he can add this to his repo for
> > testing on Bay Trail and Cherry Trail, which may use the DW UART for BT
> > operations).
>
> It's not enough though. The DW UART controller must be connected to
> the DW DMAC handshaking interface on the platform. The kernel must be
> properly set up for that too. In that case the test would be done on
> a proper target. Do the Bay Trail and Cherry Trail chips support such
> HW-setup? If so the additional test would be very welcome.

I'm not sure I understand what HW setup you mean.

Bay Trail and Cherry Trail use a shared DW DMA controller with a number of
peripheral devices, HS UART (also DW) is one of them.

> Some time ago you said that you seemed to meet a similar issue on older
> machines:
> https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
> If it's still possible could you please perform at least some smoke
> test on those devices?

That mainly was exactly about Bay Trail and Cherry Trail machines
(and maybe Broadwell and Haswell, but the latter two are not so
widespread nowadays).

> In case of my device this series and a previous one
> https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
> fixed all the critical issues for the DW UART + DW DMAC buddies:
> 1. Sudden data disappearing at the tail of the transfers (previous
> patch set).
> 2. Random system freeze (this patch set).
>
> There is another problem caused by the too slow coherent memory IO on
> my device. Due to that the data gets copied too slowly in the
> __dma_rx_complete()->tty_insert_flip_string() call. As a result fast
> incoming traffic overflows the DW UART inbound FIFO. But that can be
> worked around by decreasing the Rx DMA-buffer size. (There are some
> more generic fixes possible, but they haven't proven to be as effective
> as the buffer size reduction.)

This sounds like a specific quirk for a specific platform. In case you
are going to address that make sure it does not become generic.

--
With Best Regards,
Andy Shevchenko
On Fri, Sep 20, 2024 at 05:24:37PM +0300, Andy Shevchenko wrote:
> On Fri, Sep 20, 2024 at 12:33:51PM +0300, Serge Semin wrote:
> > On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:
> > > On Wed, Sep 11, 2024 at 09:46:08PM +0300, Serge Semin wrote:
> > > > The main goal of the series is to fix the DW DMAC driver to be working
> > > > better with the serial 8250 device driver implementation. In particular it
> > > > was discovered that there is a random system freeze (caused by a
> > > > deadlock) and an occasional "BUG: XFER bit set, but channel not idle"
> > > > error printed to the log when the DW APB UART interface is used in
> > > > conjunction with the DW DMA controller. Although I guess the problem can
> > > > be found for any 8250 device using DW DMAC for the Tx/Rx-transfers
> > > > execution. Anyway this short series contains two patches fixing these
> > > > bugs. Please see the respective patches log for details.
> > > >
> > > > Link: https://lore.kernel.org/dmaengine/20240802080756.7415-1-fancer.lancer@gmail.com/
> > > > Changelog RFC:
> > > > - Add a new patch:
> > > >   [PATCH 2/2] dmaengine: dw: Fix XFER bit set, but channel not idle error
> > > >   fixing the "XFER bit set, but channel not idle" error.
> > > > - Instead of just dropping the dwc_scan_descriptors() method invocation
> > > >   calculate the residue in the Tx-status getter.
> >
> > > FWIW, this series does not regress on Intel Merrifield (SPI case),
> > > Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> >
> > Great! Thanks.
> >
> > > P.S.
> > > However it might need additional tests on the DW UART based platforms.
> > > Cc'ed to Hans just in case (it might be that he can add this to his repo for
> > > testing on Bay Trail and Cherry Trail, which may use the DW UART for BT
> > > operations).
> >
> > It's not enough though. The DW UART controller must be connected to
> > the DW DMAC handshaking interface on the platform. The kernel must be
> > properly set up for that too. In that case the test would be done on
> > a proper target. Do the Bay Trail and Cherry Trail chips support such
> > HW-setup? If so the additional test would be very welcome.
>
> I'm not sure I understand what HW setup you mean.

I meant exactly what you explained in the next sentence - whether the
Bay Trail and Cherry Trail have the DW UART capable of working with the
DW DMAC.

>
> Bay Trail and Cherry Trail use a shared DW DMA controller with a number of
> peripheral devices, HS UART (also DW) is one of them.

Ok. Thanks. Testing the patch set on these platforms makes sense then, but
of course with the kernel configured to have the DW UART device handling
the in-/outbound traffic by the DW DMA controller.

>
> > Some time ago you said that you seemed to meet a similar issue on older
> > machines:
> > https://lore.kernel.org/dmaengine/CAHp75VdXqS6xqdsQCyhaMNLvzwkFn9HU8k9SLcT=KSwF9QPN4Q@mail.gmail.com/
> > If it's still possible could you please perform at least some smoke
> > test on those devices?
>
> That mainly was exactly about Bay Trail and Cherry Trail machines
> (and maybe Broadwell and Haswell, but the latter two are not so
> widespread nowadays).
>
> > In case of my device this series and a previous one
> > https://lore.kernel.org/dmaengine/20240802075100.6475-1-fancer.lancer@gmail.com/
> > fixed all the critical issues for the DW UART + DW DMAC buddies:
> > 1. Sudden data disappearing at the tail of the transfers (previous
> > patch set).
> > 2. Random system freeze (this patch set).
> >
> > There is another problem caused by the too slow coherent memory IO on
> > my device. Due to that the data gets copied too slowly in the
> > __dma_rx_complete()->tty_insert_flip_string() call. As a result fast
> > incoming traffic overflows the DW UART inbound FIFO. But that can be
> > worked around by decreasing the Rx DMA-buffer size. (There are some
> > more generic fixes possible, but they haven't proven to be as effective
> > as the buffer size reduction.)
>
> This sounds like a specific quirk for a specific platform. In case you
> are going to address that make sure it does not become generic.

Of course reducing the buffer size is a platform-specific quirk.

A more generic fix could be to convert the DMA-buffer to being
allocated from the DMA-noncoherent memory _if_ the DMA performed by
the DW DMA-device is non-coherent anyway. In that case the
DMA-coherent memory buffer is normally allocated from the
non-cacheable memory pool, access to which is very-very slow even on
the Intel/AMD devices. So using the cacheable buffer for DMA, then
manually invalidating the cache for it before DMA IOs and prefetching
the data afterwards seemed like a more universal solution. But my tests
showed that such an approach doesn't fully solve the problem on my
device. That said, that approach permitted data-safe UART
transfers at up to 460Kbit/s, meanwhile just reducing the buffer from
16K to 512b - at up to 2.0Mbit/s. It's still not enough since the
device is capable of working at 3Mbit/s, but it's better than
460Kbit/s.

-Serge(y)

>
> --
> With Best Regards,
> Andy Shevchenko
>
>
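
The buffer-size trade-off discussed above can be put into rough numbers. The following C sketch assumes 8N1 framing (10 bits on the wire per byte), that the inbound FIFO has to absorb all traffic while a completed Rx buffer is being copied out, and that "16K" and "512b" mean bytes. The FIFO depth is a hypothetical parameter and the function names are made up for illustration; only the buffer sizes and bit rates come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8N1 framing, i.e. 10 bits on the wire per data byte. */
#define BITS_PER_FRAME 10u

/* Time in nanoseconds for the line to deliver fifo_bytes at the given bit rate. */
static uint64_t fifo_fill_time_ns(uint32_t bitrate, uint32_t fifo_bytes)
{
	assert(bitrate > 0);
	return (uint64_t)fifo_bytes * BITS_PER_FRAME * 1000000000ull / bitrate;
}

/*
 * Minimum copy throughput in bytes per second needed to move buf_bytes out
 * of the DMA buffer before a FIFO of fifo_bytes fills up:
 *   buf_bytes / fill_time = buf_bytes * bitrate / (fifo_bytes * 10)
 */
static uint64_t min_copy_rate_bytes_per_s(uint32_t bitrate, uint32_t fifo_bytes,
					  uint32_t buf_bytes)
{
	assert(fifo_bytes > 0);
	return (uint64_t)buf_bytes * bitrate /
	       ((uint64_t)fifo_bytes * BITS_PER_FRAME);
}
```

With a hypothetical 64-byte FIFO at 2 Mbit/s, the FIFO fills in 320 us. Under this model a 16384-byte buffer would have to be copied at about 51.2 MB/s to finish in time, whereas a 512-byte buffer needs only 1.6 MB/s, a factor of 32 less. That is one way to see why shrinking the buffer helps when reads from the coherent buffer are slow.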
On Fri, Sep 20, 2024 at 05:56:23PM +0300, Serge Semin wrote:
> On Fri, Sep 20, 2024 at 05:24:37PM +0300, Andy Shevchenko wrote:
> > On Fri, Sep 20, 2024 at 12:33:51PM +0300, Serge Semin wrote:
> > > On Mon, Sep 16, 2024 at 04:01:08PM +0300, Andy Shevchenko wrote:

...

> > > There is another problem caused by the too slow coherent memory IO on
> > > my device. Due to that the data gets copied too slowly in the
> > > __dma_rx_complete()->tty_insert_flip_string() call. As a result fast
> > > incoming traffic overflows the DW UART inbound FIFO. But that can be
> > > worked around by decreasing the Rx DMA-buffer size. (There are some
> > > more generic fixes possible, but they haven't proven to be as effective
> > > as the buffer size reduction.)
>
> > This sounds like a specific quirk for a specific platform. In case you
> > are going to address that make sure it does not become generic.
>
> Of course reducing the buffer size is a platform-specific quirk.
>
> A more generic fix could be to convert the DMA-buffer to being
> allocated from the DMA-noncoherent memory _if_ the DMA performed by
> the DW DMA-device is non-coherent anyway. In that case the
> DMA-coherent memory buffer is normally allocated from the
> non-cacheable memory pool, access to which is very-very slow even on
> the Intel/AMD devices. So using the cacheable buffer for DMA, then
> manually invalidating the cache for it before DMA IOs and prefetching
> the data afterwards seemed like a more universal solution. But my tests
> showed that such an approach doesn't fully solve the problem on my
> device. That said, that approach permitted data-safe UART
> transfers at up to 460Kbit/s, meanwhile just reducing the buffer from
> 16K to 512b - at up to 2.0Mbit/s. It's still not enough since the
> device is capable of working at 3Mbit/s, but it's better than
> 460Kbit/s.

Ah, interesting issue. Good luck with solving it the best way you can.
And yes, you're right that 2M support is better than 0.5M.

--
With Best Regards,
Andy Shevchenko