[v2] platform/x86: intel_scu_ipc: Avoid working around IO and cleanups

[PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Andy Shevchenko 1 year, 3 months ago

The theory is that the so called workaround in pwr_reg_rdwr() is
the actual reader of the data in 32-bit chunks. For some reason
the 8-bit IO won't fail after that. Replace the workaround by using
32-bit IO explicitly and then memcpy() as much data as was requested
by the user. The same approach is already in use in
intel_scu_ipc_dev_command_with_size().

Tested-by: Ferry Toth <fntoth@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/platform/x86/intel_scu_ipc.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
index 5b16d29c93d7..290b38627542 100644
--- a/drivers/platform/x86/intel_scu_ipc.c
+++ b/drivers/platform/x86/intel_scu_ipc.c
@@ -217,12 +217,6 @@ static inline u8 ipc_read_status(struct intel_scu_ipc_dev *scu)
 	return __raw_readl(scu->ipc_base + IPC_STATUS);
 }
 
-/* Read ipc byte data */
-static inline u8 ipc_data_readb(struct intel_scu_ipc_dev *scu, u32 offset)
-{
-	return readb(scu->ipc_base + IPC_READ_BUFFER + offset);
-}
-
 /* Read ipc u32 data */
 static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
 {
@@ -325,11 +319,10 @@ static int pwr_reg_rdwr(struct intel_scu_ipc_dev *scu, u16 *addr, u8 *data,
 	}
 
 	err = intel_scu_ipc_check_status(scu);
-	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
-		/* Workaround: values are read as 0 without memcpy_fromio */
-		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
-		for (nc = 0; nc < count; nc++)
-			data[nc] = ipc_data_readb(scu, nc);
+	if (!err) { /* Read rbuf */
+		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
+			wbuf[nc] = ipc_data_readl(scu, offset);
+		memcpy(data, wbuf, count);
 	}
 	mutex_unlock(&ipclock);
 	return err;
-- 
2.43.0.rc1.1336.g36b5255a03ac

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Ilpo Järvinen 1 year, 3 months ago

On Mon, 21 Oct 2024, Andy Shevchenko wrote:

> The theory is that the so called workaround in pwr_reg_rdwr() is
> the actual reader of the data in 32-bit chunks. For some reason
> the 8-bit IO won't fail after that. Replace the workaround by using
> 32-bit IO explicitly and then memcpy() as much data as was requested
> by the user. The same approach is already in use in
> intel_scu_ipc_dev_command_with_size().
>
> Tested-by: Ferry Toth <fntoth@gmail.com>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> ---
>  drivers/platform/x86/intel_scu_ipc.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
> index 5b16d29c93d7..290b38627542 100644
> --- a/drivers/platform/x86/intel_scu_ipc.c
> +++ b/drivers/platform/x86/intel_scu_ipc.c
> @@ -217,12 +217,6 @@ static inline u8 ipc_read_status(struct intel_scu_ipc_dev *scu)
>  	return __raw_readl(scu->ipc_base + IPC_STATUS);
>  }
>  
> -/* Read ipc byte data */
> -static inline u8 ipc_data_readb(struct intel_scu_ipc_dev *scu, u32 offset)
> -{
> -	return readb(scu->ipc_base + IPC_READ_BUFFER + offset);
> -}
> -
>  /* Read ipc u32 data */
>  static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
>  {
> @@ -325,11 +319,10 @@ static int pwr_reg_rdwr(struct intel_scu_ipc_dev *scu, u16 *addr, u8 *data,
>  	}
>  
>  	err = intel_scu_ipc_check_status(scu);
> -	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
> -		/* Workaround: values are read as 0 without memcpy_fromio */
> -		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
> -		for (nc = 0; nc < count; nc++)
> -			data[nc] = ipc_data_readb(scu, nc);
> +	if (!err) { /* Read rbuf */

What is the reason for removing that id check? This seems to be a clear
logic change, but why? And if you want to remove that check, what does
that comment mean then?

> +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> +			wbuf[nc] = ipc_data_readl(scu, offset);
> +		memcpy(data, wbuf, count);

So do we actually need to read more than
DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach 
used in intel_scu_ipc_dev_command_with_size() which you referred to.
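
As an illustration of that bound, here is a minimal user-space sketch in C.
It is not the driver code: fake_rbuf, fake_readl(), words_needed() and
read_rbuf() are hypothetical names, and a plain array stands in for the
16-byte IPC read buffer. It only shows that
DIV_ROUND_UP(min(count, 16U), sizeof(u32)) 32-bit reads are enough to cover
the requested bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_BYTES 16u
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical stand-in for the 16-byte IPC read buffer (not real MMIO). */
static uint32_t fake_rbuf[RBUF_BYTES / sizeof(uint32_t)];
static unsigned int fake_reads;	/* counts simulated 32-bit reads */

/* Simulated 32-bit read at a byte offset into the buffer. */
static uint32_t fake_readl(size_t offset)
{
	fake_reads++;
	return fake_rbuf[offset / sizeof(uint32_t)];
}

/* Number of 32-bit words needed to cover count bytes, capped at 16 bytes. */
static size_t words_needed(size_t count)
{
	size_t bytes = count < RBUF_BYTES ? count : RBUF_BYTES;

	return DIV_ROUND_UP(bytes, sizeof(uint32_t));
}

/* Read only as many words as needed, then copy out the requested bytes. */
static void read_rbuf(uint8_t *data, size_t count)
{
	uint32_t wbuf[RBUF_BYTES / sizeof(uint32_t)] = { 0 };
	size_t nc, n = words_needed(count);

	assert(data != NULL);
	for (nc = 0; nc < n; nc++)
		wbuf[nc] = fake_readl(nc * sizeof(uint32_t));
	memcpy(data, wbuf, count < RBUF_BYTES ? count : RBUF_BYTES);
}
```

In this sketch a request for 5 bytes performs two 32-bit reads rather than
four.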

>  	}
>  	mutex_unlock(&ipclock);
>  	return err;

FYI (unrelated to this patch), there seem to be some open-coded
FIELD_PREP()s in pwr_reg_rdwr(), some of which are common code between
those if branches too.
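
For readers unfamiliar with the term, the following user-space C sketch
contrasts an open-coded shift-and-OR with a FIELD_PREP()-style helper. It is
only an approximation: FIELD_PREP_SKETCH() is a hypothetical stand-in for the
kernel macro built on the GCC/Clang builtin __builtin_ctz(), and the DEMO_*
masks are an invented field layout, not the real IPC command format:

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space approximation of the kernel's FIELD_PREP(): shift the value
 * to the position of the mask's lowest set bit, then apply the mask.
 * The mask must be non-zero, since __builtin_ctz(0) is undefined.
 */
#define FIELD_PREP_SKETCH(mask, val) \
	(((uint32_t)(val) << __builtin_ctz(mask)) & (uint32_t)(mask))

/* Hypothetical field layout, for illustration only. */
#define DEMO_SIZE_MASK	0x00ff0000u
#define DEMO_SUB_MASK	0x0000f000u
#define DEMO_CMD_MASK	0x000000ffu

/* Open-coded variant: the shifts must be kept in sync with the layout. */
static uint32_t demo_cmd_open_coded(uint32_t size, uint32_t sub, uint32_t cmd)
{
	/* Nothing masks the values here, so check their widths explicitly. */
	assert(size <= 0xff && sub <= 0xf && cmd <= 0xff);
	return size << 16 | sub << 12 | cmd;
}

/* Mask-based variant: each field is described once, by its mask. */
static uint32_t demo_cmd_field_prep(uint32_t size, uint32_t sub, uint32_t cmd)
{
	return FIELD_PREP_SKETCH(DEMO_SIZE_MASK, size) |
	       FIELD_PREP_SKETCH(DEMO_SUB_MASK, sub) |
	       FIELD_PREP_SKETCH(DEMO_CMD_MASK, cmd);
}
```

Both helpers produce the same word for in-range values; the mask-based one
additionally truncates a value that is too wide for its field.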

-- 
 i.

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Andy Shevchenko 1 year, 3 months ago

On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> 
> > The theory is that the so called workaround in pwr_reg_rdwr() is
> > the actual reader of the data in 32-bit chunks. For some reason
> > the 8-bit IO won't fail after that. Replace the workaround by using
> > 32-bit IO explicitly and then memcpy() as much data as was requested
> > by the user. The same approach is already in use in
> > intel_scu_ipc_dev_command_with_size().

...

> >  	err = intel_scu_ipc_check_status(scu);
> > -	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
> > -		/* Workaround: values are read as 0 without memcpy_fromio */
> > -		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
> > -		for (nc = 0; nc < count; nc++)
> > -			data[nc] = ipc_data_readb(scu, nc);
> > +	if (!err) { /* Read rbuf */
> 
> What is the reason for removing that id check? This seems to be a clear
> logic change, but why? And if you want to remove that check, what does
> that comment mean then?

Let me split this to a separate change with better explanation then.

> > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > +		memcpy(data, wbuf, count);
> 
> So do we actually need to read more than
> DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach 
> used in intel_scu_ipc_dev_command_with_size() which you referred to.

I'm not sure I follow. We do IO for the whole (16-byte) buffer, but return only
the requested _bytes_ to the user.
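
A minimal user-space sketch of that behaviour, mirroring the loop in the
patch, is shown below. It is an illustration under assumptions, not the
driver code: demo_rbuf, demo_readl() and read_whole_rbuf() are hypothetical
names, and a constant array replaces the real IPC read buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RBUF_WORDS 4	/* 16-byte buffer read as four 32-bit words */

/* Hypothetical stand-in for the IPC read buffer contents. */
static const uint32_t demo_rbuf[DEMO_RBUF_WORDS] = {
	0x11223344u, 0x55667788u, 0x99aabbccu, 0xddeeff00u,
};
static unsigned int demo_reads;	/* counts simulated 32-bit reads */

/* Simulated 32-bit read at a byte offset into the buffer. */
static uint32_t demo_readl(unsigned int offset)
{
	demo_reads++;
	return demo_rbuf[offset / 4];
}

/* Always read the whole buffer with 32-bit IO, return only count bytes. */
static void read_whole_rbuf(uint8_t *data, size_t count)
{
	uint32_t wbuf[DEMO_RBUF_WORDS];
	unsigned int nc, offset;

	/* The caller must not ask for more than the buffer holds. */
	assert(count <= sizeof(wbuf));
	for (nc = 0, offset = 0; nc < DEMO_RBUF_WORDS; nc++, offset += 4)
		wbuf[nc] = demo_readl(offset);
	memcpy(data, wbuf, count);
}
```

Here a request for 3 bytes still issues four 32-bit reads; only the final
memcpy() depends on count.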

> >  	}
> >  	mutex_unlock(&ipclock);
> >  	return err;
> 
> FYI (unrelated to this patch), there seem to be some open-coded
> FIELD_PREP()s in pwr_reg_rdwr(), some of which are common code between
> those if branches too.

This code is quite old and full of tricks that have to be tested. So yes,
while it's possible to convert, I would like to do it in small (baby)
steps. This series is already quite intrusive from this perspective :-)

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Ilpo Järvinen 1 year, 3 months ago

On Mon, 21 Oct 2024, Andy Shevchenko wrote:

> On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > 
> > > The theory is that the so called workaround in pwr_reg_rdwr() is
> > > the actual reader of the data in 32-bit chunks. For some reason
> > > the 8-bit IO won't fail after that. Replace the workaround by using
> > > 32-bit IO explicitly and then memcpy() as much data as was requested
> > > by the user. The same approach is already in use in
> > > intel_scu_ipc_dev_command_with_size().
> 
> ...
> 
> > >  	err = intel_scu_ipc_check_status(scu);
> > > -	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
> > > -		/* Workaround: values are read as 0 without memcpy_fromio */
> > > -		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
> > > -		for (nc = 0; nc < count; nc++)
> > > -			data[nc] = ipc_data_readb(scu, nc);
> > > +	if (!err) { /* Read rbuf */
> > 
> > > What is the reason for removing that id check? This seems to be a clear
> > > logic change, but why? And if you want to remove that check, what does
> > > that comment mean then?
> 
> Let me split this to a separate change with better explanation then.
> 
> > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > +		memcpy(data, wbuf, count);
> > 
> > So do we actually need to read more than
> > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach 
> > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> 
> I'm not sure I follow. We do IO for the whole (16-byte) buffer, but return only
> the requested _bytes_ to the user.

So always reading 16 bytes is not part of the old workaround? Because it
has a "let's read enough" feel.

> > >  	}
> > >  	mutex_unlock(&ipclock);
> > >  	return err;
> > 
> > FYI (unrelated to this patch), there seem to be some open-coded
> > FIELD_PREP()s in pwr_reg_rdwr(), some of which are common code between
> > those if branches too.
>
> This code is quite old and full of tricks that have to be tested. So yes,
> while it's possible to convert, I would like to do it in small (baby)
> steps. This series is already quite intrusive from this perspective :-)

Yeah, no pressure, I just noted down what I saw. :-)

-- 
 i.

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Andy Shevchenko 1 year, 3 months ago

On Mon, Oct 21, 2024 at 12:49:08PM +0300, Ilpo Järvinen wrote:
> On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:

...

> > > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > > +		memcpy(data, wbuf, count);
> > > 
> > > So do we actually need to read more than
> > > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach 
> > > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> > 
> > I'm not sure I follow. We do IO for the whole (16-byte) buffer, but return only
> > the requested _bytes_ to the user.
>
> So always reading 16 bytes is not part of the old workaround? Because it
> has a "let's read enough" feel.

Ah, now I got it! Yes, we may reduce the reads to just the needed ones.
The idea is that we always have to perform 32-bit reads, independently
of the amount of data we want.

> > > >  	}
> > > >  	mutex_unlock(&ipclock);
> > > >  	return err;
> > > 
> > > FYI (unrelated to this patch), there seem to be some open-coded
> > > FIELD_PREP()s in pwr_reg_rdwr(), some of which are common code between
> > > those if branches too.
> >
> > This code is quite old and full of tricks that have to be tested. So yes,
> > while it's possible to convert, I would like to do it in small (baby)
> > steps. This series is already quite intrusive from this perspective :-)
> 
> Yeah, no pressure, I just noted down what I saw. :-)

Thanks, I will keep this.

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Andy Shevchenko 1 year, 3 months ago

On Mon, Oct 21, 2024 at 12:54:16PM +0300, Andy Shevchenko wrote:
> On Mon, Oct 21, 2024 at 12:49:08PM +0300, Ilpo Järvinen wrote:
> > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > > On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:

...

> > > > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > > > +		memcpy(data, wbuf, count);
> > > > 
> > > > So do we actually need to read more than
> > > > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach 
> > > > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> > > 
> > > I'm not sure I follow. We do IO for the whole (16-byte) buffer, but return only
> > > the requested _bytes_ to the user.
> >
> > So always reading 16 bytes is not part of the old workaround? Because it
> > has a "let's read enough" feel.
>
> Ah, now I got it! Yes, we may reduce the reads to just the needed ones.
> The idea is that we always have to perform 32-bit reads, independently
> of the amount of data we want.

Oh, looking at the code (*), it seems the original really mixes up bytes and
32-bit words! Since the above has been tested, let me put this on the TODO
list to clarify this mess and run another round of testing.

Sounds good to you?

*) The mythical comment about a maximum of 5 items for a 20-byte buffer is
worrying, and now I know why.

-- 
With Best Regards,
Andy Shevchenko

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Ilpo Järvinen 1 year, 3 months ago

On Mon, 21 Oct 2024, Andy Shevchenko wrote:

> On Mon, Oct 21, 2024 at 12:54:16PM +0300, Andy Shevchenko wrote:
> > On Mon, Oct 21, 2024 at 12:49:08PM +0300, Ilpo Järvinen wrote:
> > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > > > On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > > > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> 
> ...
> 
> > > > > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > > > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > > > > +		memcpy(data, wbuf, count);
> > > > > 
> > > > > So do we actually need to read more than
> > > > > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach 
> > > > > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> > > > 
> > > > I'm not sure I follow. We do IO for the whole (16-byte) buffer, but return only
> > > > the requested _bytes_ to the user.
> > >
> > > So always reading 16 bytes is not part of the old workaround? Because it
> > > has a "let's read enough" feel.
> >
> > Ah, now I got it! Yes, we may reduce the reads to just the needed ones.
> > The idea is that we always have to perform 32-bit reads, independently
> > of the amount of data we want.
>
> Oh, looking at the code (*), it seems the original really mixes up bytes and
> 32-bit words! Since the above has been tested, let me put this on the TODO
> list to clarify this mess and run another round of testing.
> 
> Sounds good to you?

Sure, I'm fine with taking the careful approach.

> *) The mythical comment about a maximum of 5 items for a 20-byte buffer is
> worrying, and now I know why.

The functions carrying that comment seem to be called only from
scu_reg_access(), which rejects count > 4 with an error.

-- 
 i.

Re: [PATCH v2 1/3] platform/x86: intel_scu_ipc: Replace workaround by 32-bit IO

Posted by Mika Westerberg 1 year, 3 months ago

On Mon, Oct 21, 2024 at 11:38:51AM +0300, Andy Shevchenko wrote:
> The theory is that the so called workaround in pwr_reg_rdwr() is
> the actual reader of the data in 32-bit chunks. For some reason
> the 8-bit IO won't fail after that. Replace the workaround by using
> 32-bit IO explicitly and then memcpy() as much data as was requested
> by the user. The same approach is already in use in
> intel_scu_ipc_dev_command_with_size().
> 
> Tested-by: Ferry Toth <fntoth@gmail.com>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>