RE: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music

David Rau posted 1 patch 2 years, 7 months ago
sound/soc/codecs/da7219-aad.c | 156 ++++++++++++++++++++++------------
sound/soc/codecs/da7219-aad.h |   3 +
2 files changed, 105 insertions(+), 54 deletions(-)
RE: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by David Rau 2 years, 7 months ago


-----Original Message-----
From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
Sent: Saturday, February 4, 2023 23:42
To: Mark Brown <broonie@kernel.org>
Cc: David Rau <david.rau.zg@renesas.com>; perex@perex.cz; lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music

On Thu, Feb 02, 2023 at 07:36:42PM +0000, Mark Brown wrote:
> 
> > > they have the potential to actually lock up are the 
> > > cancel_work_sync() calls but they were unchanged and the backtrace 
> > > you showed was showing the thread in the msleep().  My guess would 
> > > be that you've got systems where there are very frequent jack 
> > > detection events (potentiallly with broken accessories, or 
> > > possibly due to the ground switch putting things into the wrong 
> > > priority) and that the interrupt is firing again as soon as the 
> > > thread unmasks the primary interrupt which means it never actually stops running.
> 
> > That is what I strongly suspect is happening. I don't know why 
> > exactly the interrupt is firing continuously, but the hang is always in msleep().
> > One possibility might be that the event is actually a disconnect 
> > event, and that enabling and immediately disabling the ground switch 
> > causes another interrupt, which is then handled immediately, causing the hang.
> 
> Could be.  I'd be willing to guess that it's not just one event but 
> rather a stream of events of some kind.  Possibly if it's due to the 
> ground switch it's spuriously detecting a constant stream of button 
> presses for the affected systems, which don't produce any UI visible 
> result which would cause users to pull the accessory for whatever 
> reason?  Whatever's going on I bet it's broken accessories triggering it.
> 

> That seems to be unlikely. The average number of crashes per affected system is 1.92, which points to something the users are doing and less to a broken accessory. 
> We do observe crashes due to broken accessories, but in those cases the number of crashes per system tends to be much > higher.

> Anyway, below is a patch with a possible fix. Of course, I still don't know what the patch originally tried to fix, so it might not do much if anything good.
I added the software debouncing before insertion task to ensue the better compatibility of OMTP Jack. 
> For example, it keeps button detection in the interrupt handler to avoid dropping button events, so if spurious button detection as you suspected is indeed (part of) the problem we might still see a large number of interrupts.

> Guenter

Thanks a lot for your big efforts to implement the temporary fix and verifications.
Would you please let me know the average number of crashes per affected system if you rollback to the pervious fix?
Ref:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/sound/soc/codecs?id=2d969e8f35b1849a43156029a7a6e2943b89d0c0
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/sound/soc/codecs?id=06f5882122e3faa183d76c4ec2c92f4c38e2c7bb

David
---
From 81dbe47c94d8d97ce7919a5ec4d4269c55b56ae6 Mon Sep 17 00:00:00 2001
From: Guenter Roeck <linux@roeck-us.net>
Date: Thu, 2 Feb 2023 16:09:14 -0800
Subject: [RFC] ASoC: da7219: Prevent hung task errors in interrupt handler

Commit 969357ec94e6 ("ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music") tried to improve pole orientation on certain headsets. Part of the change was to add a long sleep in the beginning of the interrupt handler, followed by enabling the ground switch

Unfortunately, this results in hung tasks in the threaded interrupt handler.

INFO: task irq/105-da7219-:2556 blocked for more than 122 seconds.
Not tainted 5.10.159-20945-g4390861bfc33 #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:irq/105-da7219- state:D stack: 0 pid: 2556 ppid: 2 flags:0x00004080 Call Trace:
 __schedule+0x3b0/0xddb
 schedule+0x44/0xa8
 schedule_timeout+0xae/0x241
 ? run_local_timers+0x4e/0x4e
 msleep+0x2c/0x38
 da7219_aad_irq_thread+0x66/0x2b0
 irq_thread_fn+0x22/0x4d
 irq_thread+0x131/0x1cb
 ? irq_forced_thread_fn+0x5f/0x5f
 ? irq_thread_fn+0x4d/0x4d
 kthread+0x142/0x153
 ? synchronize_irq+0xe0/0xe0
 ? kthread_blkcg+0x31/0x31
 ret_from_fork+0x1f/0x30

Solve the problem by enabling the ground switch immediately and only after an insertion has been detected. Delay pole orientation detection until after the chip reports that detection is complete plus an additional time depending on the chip configuration. Do this by implementing ground switch detection in a delayed worker.

Fixes: 969357ec94e6 ("ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music")
Cc: David Rau <david.rau.zg@renesas.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 sound/soc/codecs/da7219-aad.c | 156 ++++++++++++++++++++++------------
 sound/soc/codecs/da7219-aad.h |   3 +
 2 files changed, 105 insertions(+), 54 deletions(-)

diff --git a/sound/soc/codecs/da7219-aad.c b/sound/soc/codecs/da7219-aad.c index c55b033d89da..47685c996bda 100644
--- a/sound/soc/codecs/da7219-aad.c
+++ b/sound/soc/codecs/da7219-aad.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/module.h>
+#include <linux/mutex.h>
 #include <linux/platform_device.h>
 #include <linux/clk.h>
 #include <linux/i2c.h>
@@ -339,6 +340,82 @@ static void da7219_aad_hptest_work(struct work_struct *work)
 				    SND_JACK_HEADSET | SND_JACK_LINEOUT);  }
 
+static void da7219_aad_handle_removal(struct da7219_aad_priv 
+*da7219_aad) {
+	struct snd_soc_component *component = da7219_aad->component;
+	struct snd_soc_dapm_context *dapm = snd_soc_component_get_dapm(component);
+	struct da7219_priv *da7219 = snd_soc_component_get_drvdata(component);
+
+	da7219_aad->jack_inserted = false;
+
+	/* Cancel any pending work */
+	cancel_work_sync(&da7219_aad->btn_det_work);
+	cancel_work_sync(&da7219_aad->hptest_work);
+
+	/* Un-drive headphones/lineout */
+	snd_soc_component_update_bits(component, DA7219_HP_R_CTRL,
+				      DA7219_HP_R_AMP_OE_MASK, 0);
+	snd_soc_component_update_bits(component, DA7219_HP_L_CTRL,
+				      DA7219_HP_L_AMP_OE_MASK, 0);
+
+	/* Ensure button detection disabled */
+	snd_soc_component_update_bits(component, DA7219_ACCDET_CONFIG_1,
+				      DA7219_BUTTON_CONFIG_MASK, 0);
+
+	da7219->micbias_on_event = false;
+
+	/* Disable mic bias */
+	snd_soc_dapm_disable_pin(dapm, "Mic Bias");
+	snd_soc_dapm_sync(dapm);
+
+	/* Disable ground switch */
+	snd_soc_component_update_bits(component, 0xFB, 0x01, 0x00);
+
+	snd_soc_jack_report(da7219_aad->jack, 0, DA7219_AAD_REPORT_ALL_MASK); 
+}
+
+static void da7219_aad_insertion_work(struct work_struct *work) {
+	struct da7219_aad_priv *da7219_aad =
+		container_of(work, struct da7219_aad_priv, hptest_work);
+	struct snd_soc_component *component = da7219_aad->component;
+	u8 statusa;
+
+	mutex_lock(&da7219_aad->insertion_mutex);
+
+	if (!da7219_aad->jack_inserted)
+		goto unlock;
+
+	/* Read status register for jack insertion & type status */
+	statusa = snd_soc_component_read(component, DA7219_ACCDET_STATUS_A);
+	if (!(statusa & DA7219_JACK_INSERTION_STS_MASK)) {
+		da7219_aad_handle_removal(da7219_aad);
+		goto unlock;
+	}
+
+	/*
+	 * If 4-pole, then enable button detection, else perform
+	 * HP impedance test to determine output type to report.
+	 *
+	 * We schedule work here as the tasks themselves can
+	 * take time to complete, and in particular for hptest
+	 * we want to be able to check if the jack was removed
+	 * during the procedure as this will invalidate the
+	 * result. By doing this as work, the IRQ thread can
+	 * handle a removal, and we can check at the end of
+	 * hptest if we have a valid result or not.
+	 */
+	if (statusa & DA7219_JACK_TYPE_STS_MASK) {
+		schedule_work(&da7219_aad->btn_det_work);
+		snd_soc_jack_report(da7219_aad->jack, SND_JACK_HEADSET,
+				    SND_JACK_HEADSET | SND_JACK_LINEOUT);
+	} else {
+		schedule_work(&da7219_aad->hptest_work);
+	}
+
+unlock:
+	mutex_unlock(&da7219_aad->insertion_mutex);
+}
 
 /*
  * IRQ
@@ -348,23 +425,21 @@ static irqreturn_t da7219_aad_irq_thread(int irq, void *data)  {
 	struct da7219_aad_priv *da7219_aad = data;
 	struct snd_soc_component *component = da7219_aad->component;
-	struct snd_soc_dapm_context *dapm = snd_soc_component_get_dapm(component);
 	struct da7219_priv *da7219 = snd_soc_component_get_drvdata(component);
 	u8 events[DA7219_AAD_IRQ_REG_MAX];
-	u8 statusa, srm_st;
+	u8 statusa;
 	int i, report = 0, mask = 0;
 
-	srm_st = snd_soc_component_read(component, DA7219_PLL_SRM_STS) & DA7219_PLL_SRM_STS_MCLK;
-	msleep(da7219_aad->gnd_switch_delay * ((srm_st == 0x0) ? 2 : 1) - 4);
-	/* Enable ground switch */
-	snd_soc_component_update_bits(component, 0xFB, 0x01, 0x01);
+	mutex_lock(&da7219_aad->insertion_mutex);
 
 	/* Read current IRQ events */
 	regmap_bulk_read(da7219->regmap, DA7219_ACCDET_IRQ_EVENT_A,
 			 events, DA7219_AAD_IRQ_REG_MAX);
 
-	if (!events[DA7219_AAD_IRQ_REG_A] && !events[DA7219_AAD_IRQ_REG_B])
+	if (!events[DA7219_AAD_IRQ_REG_A] && !events[DA7219_AAD_IRQ_REG_B]) {
+		mutex_unlock(&da7219_aad->insertion_mutex);
 		return IRQ_NONE;
+	}
 
 	/* Read status register for jack insertion & type status */
 	statusa = snd_soc_component_read(component, DA7219_ACCDET_STATUS_A); @@ -378,36 +453,29 @@ static irqreturn_t da7219_aad_irq_thread(int irq, void *data)
 		statusa);
 
 	if (statusa & DA7219_JACK_INSERTION_STS_MASK) {
+		u8 srm_st;
+
+		srm_st = snd_soc_component_read(component, DA7219_PLL_SRM_STS) &
+							DA7219_PLL_SRM_STS_MCLK;
+
 		/* Jack Insertion */
 		if (events[DA7219_AAD_IRQ_REG_A] &
 		    DA7219_E_JACK_INSERTED_MASK) {
 			report |= SND_JACK_MECHANICAL;
 			mask |= SND_JACK_MECHANICAL;
 			da7219_aad->jack_inserted = true;
+
+			/* Enable ground switch */
+			snd_soc_component_update_bits(component, 0xFB, 0x01, 0x01);
 		}
 
 		/* Jack type detection */
 		if (events[DA7219_AAD_IRQ_REG_A] &
 		    DA7219_E_JACK_DETECT_COMPLETE_MASK) {
-			/*
-			 * If 4-pole, then enable button detection, else perform
-			 * HP impedance test to determine output type to report.
-			 *
-			 * We schedule work here as the tasks themselves can
-			 * take time to complete, and in particular for hptest
-			 * we want to be able to check if the jack was removed
-			 * during the procedure as this will invalidate the
-			 * result. By doing this as work, the IRQ thread can
-			 * handle a removal, and we can check at the end of
-			 * hptest if we have a valid result or not.
-			 */
-			if (statusa & DA7219_JACK_TYPE_STS_MASK) {
-				report |= SND_JACK_HEADSET;
-				mask |=	SND_JACK_HEADSET | SND_JACK_LINEOUT;
-				schedule_work(&da7219_aad->btn_det_work);
-			} else {
-				schedule_work(&da7219_aad->hptest_work);
-			}
+			int delay = da7219_aad->gnd_switch_delay *
+						((srm_st == 0x0) ? 2 : 1) - 4;
+
+			schedule_delayed_work(&da7219_aad->insertion_work, delay);
 		}
 
 		/* Button support for 4-pole jack */
@@ -431,40 +499,16 @@ static irqreturn_t da7219_aad_irq_thread(int irq, void *data)
 				}
 			}
 		}
+		snd_soc_jack_report(da7219_aad->jack, report, mask);
 	} else {
 		/* Jack removal */
 		if (events[DA7219_AAD_IRQ_REG_A] & DA7219_E_JACK_REMOVED_MASK) {
-			report = 0;
-			mask |= DA7219_AAD_REPORT_ALL_MASK;
-			da7219_aad->jack_inserted = false;
-
-			/* Cancel any pending work */
-			cancel_work_sync(&da7219_aad->btn_det_work);
-			cancel_work_sync(&da7219_aad->hptest_work);
-
-			/* Un-drive headphones/lineout */
-			snd_soc_component_update_bits(component, DA7219_HP_R_CTRL,
-					    DA7219_HP_R_AMP_OE_MASK, 0);
-			snd_soc_component_update_bits(component, DA7219_HP_L_CTRL,
-					    DA7219_HP_L_AMP_OE_MASK, 0);
-
-			/* Ensure button detection disabled */
-			snd_soc_component_update_bits(component, DA7219_ACCDET_CONFIG_1,
-					    DA7219_BUTTON_CONFIG_MASK, 0);
-
-			da7219->micbias_on_event = false;
-
-			/* Disable mic bias */
-			snd_soc_dapm_disable_pin(dapm, "Mic Bias");
-			snd_soc_dapm_sync(dapm);
-
-			/* Disable ground switch */
-			snd_soc_component_update_bits(component, 0xFB, 0x01, 0x00);
+			cancel_delayed_work(&da7219_aad->insertion_work);
+			da7219_aad_handle_removal(da7219_aad);
 		}
 	}
 
-	snd_soc_jack_report(da7219_aad->jack, report, mask);
-
+	mutex_unlock(&da7219_aad->insertion_mutex);
 	return IRQ_HANDLED;
 }
 
@@ -938,6 +982,9 @@ int da7219_aad_init(struct snd_soc_component *component)
 	snd_soc_component_update_bits(component, DA7219_ACCDET_CONFIG_1,
 			    DA7219_BUTTON_CONFIG_MASK, 0);
 
+	mutex_init(&da7219_aad->insertion_mutex);
+
+	INIT_DELAYED_WORK(&da7219_aad->insertion_work, 
+da7219_aad_insertion_work);
 	INIT_WORK(&da7219_aad->btn_det_work, da7219_aad_btn_det_work);
 	INIT_WORK(&da7219_aad->hptest_work, da7219_aad_hptest_work);
 
@@ -973,6 +1020,7 @@ void da7219_aad_exit(struct snd_soc_component *component)
 
 	free_irq(da7219_aad->irq, da7219_aad);
 
+	cancel_delayed_work_sync(&da7219_aad->insertion_work);
 	cancel_work_sync(&da7219_aad->btn_det_work);
 	cancel_work_sync(&da7219_aad->hptest_work);
 }
diff --git a/sound/soc/codecs/da7219-aad.h b/sound/soc/codecs/da7219-aad.h index 21fdf53095cc..b1b7f8ba45bd 100644
--- a/sound/soc/codecs/da7219-aad.h
+++ b/sound/soc/codecs/da7219-aad.h
@@ -10,6 +10,7 @@
 #ifndef __DA7219_AAD_H
 #define __DA7219_AAD_H
 
+#include <linux/mutex.h>
 #include <linux/timer.h>
 #include <sound/soc.h>
 #include <sound/jack.h>
@@ -194,6 +195,8 @@ struct da7219_aad_priv {
 
 	u8 btn_cfg;
 
+	struct mutex insertion_mutex;
+	struct delayed_work insertion_work;
 	struct work_struct btn_det_work;
 	struct work_struct hptest_work;
 
--
2.39.1
Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by Guenter Roeck 2 years, 7 months ago
On 2/5/23 21:38, David Rau wrote:
> 
> 
> -----Original Message-----
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
> Sent: Saturday, February 4, 2023 23:42
> To: Mark Brown <broonie@kernel.org>
> Cc: David Rau <david.rau.zg@renesas.com>; perex@perex.cz; lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
> 
> On Thu, Feb 02, 2023 at 07:36:42PM +0000, Mark Brown wrote:
>>
>>>> they have the potential to actually lock up are the
>>>> cancel_work_sync() calls but they were unchanged and the backtrace
>>>> you showed was showing the thread in the msleep().  My guess would
>>>> be that you've got systems where there are very frequent jack
>>>> detection events (potentiallly with broken accessories, or
>>>> possibly due to the ground switch putting things into the wrong
>>>> priority) and that the interrupt is firing again as soon as the
>>>> thread unmasks the primary interrupt which means it never actually stops running.
>>
>>> That is what I strongly suspect is happening. I don't know why
>>> exactly the interrupt is firing continuously, but the hang is always in msleep().
>>> One possibility might be that the event is actually a disconnect
>>> event, and that enabling and immediately disabling the ground switch
>>> causes another interrupt, which is then handled immediately, causing the hang.
>>
>> Could be.  I'd be willing to guess that it's not just one event but
>> rather a stream of events of some kind.  Possibly if it's due to the
>> ground switch it's spuriously detecting a constant stream of button
>> presses for the affected systems, which don't produce any UI visible
>> result which would cause users to pull the accessory for whatever
>> reason?  Whatever's going on I bet it's broken accessories triggering it.
>>
> 
>> That seems to be unlikely. The average number of crashes per affected system is 1.92, which points to something the users are doing and less to a broken accessory.
>> We do observe crashes due to broken accessories, but in those cases the number of crashes per system tends to be much > higher.
> 
>> Anyway, below is a patch with a possible fix. Of course, I still don't know what the patch originally tried to fix, so it might not do much if anything good.
> I added the software debouncing before insertion task to ensue the better compatibility of OMTP Jack.
>> For example, it keeps button detection in the interrupt handler to avoid dropping button events, so if spurious button detection as you suspected is indeed (part of) the problem we might still see a large number of interrupts.
> 
>> Guenter
> 
> Thanks a lot for your big efforts to implement the temporary fix and verifications.
> Would you please let me know the average number of crashes per affected system if you rollback to the pervious fix?
> Ref:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/sound/soc/codecs?id=2d969e8f35b1849a43156029a7a6e2943b89d0c0
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/sound/soc/codecs?id=06f5882122e3faa183d76c4ec2c92f4c38e2c7bb
> 

You mean just keep the above two patches and revert 969357ec94e6 ?
Sure, I can do that, but feedback from the field would take some
2-3 months. Is that what you recommend to do for now ?

Thanks,
Guenter
RE: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by David Rau 2 years, 7 months ago

-----Original Message-----
From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
Sent: Monday, February 6, 2023 22:05
To: David Rau <david.rau.zg@renesas.com>; Mark Brown <broonie@kernel.org>
Cc: perex@perex.cz; lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music

On 2/5/23 21:38, David Rau wrote:
>> 
>> 
>> -----Original Message-----
>> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
>> Sent: Saturday, February 4, 2023 23:42
>> To: Mark Brown <broonie@kernel.org>
>> Cc: David Rau <david.rau.zg@renesas.com>; perex@perex.cz; 
>> lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; 
>> alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on 
>> OMTP headsets when playing music
>> 
>> On Thu, Feb 02, 2023 at 07:36:42PM +0000, Mark Brown wrote:
>>>
>>>>> they have the potential to actually lock up are the
>>>>> cancel_work_sync() calls but they were unchanged and the backtrace 
>>>>> you showed was showing the thread in the msleep().  My guess would 
>>>>> be that you've got systems where there are very frequent jack 
>>>>> detection events (potentiallly with broken accessories, or possibly 
>>>>> due to the ground switch putting things into the wrong
>>>>> priority) and that the interrupt is firing again as soon as the 
>>>>> thread unmasks the primary interrupt which means it never actually stops running.
>>>
>>>> That is what I strongly suspect is happening. I don't know why 
>>>> exactly the interrupt is firing continuously, but the hang is always in msleep().
>>>> One possibility might be that the event is actually a disconnect 
>>>> event, and that enabling and immediately disabling the ground switch 
>>>> causes another interrupt, which is then handled immediately, causing the hang.
>>>
>>> Could be.  I'd be willing to guess that it's not just one event but 
>>> rather a stream of events of some kind.  Possibly if it's due to the 
>>> ground switch it's spuriously detecting a constant stream of button 
>>> presses for the affected systems, which don't produce any UI visible 
>>> result which would cause users to pull the accessory for whatever 
>>> reason?  Whatever's going on I bet it's broken accessories triggering it.
>>>
> > 
>> That seems to be unlikely. The average number of crashes per affected system is 1.92, which points to something the users are doing and less to a broken accessory.
> >> We do observe crashes due to broken accessories, but in those cases the number of crashes per system tends to be much > higher.
> > 
> >> Anyway, below is a patch with a possible fix. Of course, I still don't know what the patch originally tried to fix, so it might not do much if anything good.
> > I added the software debouncing before insertion task to ensue the better compatibility of OMTP Jack.
> >> For example, it keeps button detection in the interrupt handler to avoid dropping button events, so if spurious button detection as you suspected is indeed (part of) the problem we might still see a large number of interrupts.
> > 
> >> Guenter
> > > 
> > Thanks a lot for your big efforts to implement the temporary fix and verifications.
> > Would you please let me know the average number of crashes per affected system if you rollback to the pervious fix?
> > Ref:
> > https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.
> > kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2
> > Fcommit%2Fsound%2Fsoc%2Fcodecs%3Fid%3D2d969e8f35b1849a43156029a7a6e294
> > 3b89d0c0&data=05%7C01%7Cdavid.rau.zg%40renesas.com%7Cae6910f8ff4e4e299
> > bc408db084b1a2a%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638112890
> > 873388020%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIi
> > LCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=8KgHP%2FOD%2BTDcr
> > rUVSATFkDCDDmhiCu7d5%2FKhyOszThA%3D&reserved=0
> > https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.
> > kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2
> > Fcommit%2Fsound%2Fsoc%2Fcodecs%3Fid%3D06f5882122e3faa183d76c4ec2c92f4c
> > 38e2c7bb&data=05%7C01%7Cdavid.rau.zg%40renesas.com%7Cae6910f8ff4e4e299
> > bc408db084b1a2a%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638112890
> > 873388020%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIi
> > LCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=WosfvANk0YxeJD5PG
> > %2FnAuAWVqt7m4U3mMaYXefLLdS4%3D&reserved=0
> > 

> You mean just keep the above two patches and revert 969357ec94e6 ?
> Sure, I can do that, but feedback from the field would take some
> 2-3 months. Is that what you recommend to do for now ?

> Thanks,
> Guenter

Thanks for the feedback.
What I mean is just do rollback to remove the "sleep" patch I did in your repository.

After the rollback, could you please let me know the average number of crashes per affected system with the same test conditions?
Will it still take some 2-3 months?

Thanks.
David
Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by Guenter Roeck 2 years, 7 months ago
On Tue, Feb 07, 2023 at 02:42:14AM +0000, David Rau wrote:
> 
> > You mean just keep the above two patches and revert 969357ec94e6 ?
> > Sure, I can do that, but feedback from the field would take some
> > 2-3 months. Is that what you recommend to do for now ?
> 
> > Thanks,
> > Guenter
> 
> Thanks for the feedback.
> What I mean is just do rollback to remove the "sleep" patch I did in your repository.
> 
> After the rollback, could you please let me know the average number of crashes per affected system with the same test conditions?
> Will it still take some 2-3 months?
> 
The msleep() patch has been reverted in R111 and dev
releases of ChromeOS. I did not get permission to land
the revert in R110, meaning we'll continue to see the
crashes there. R111 is expected to go to Beta shortly,
so we should get _some_ feedback in the next few weeks.

Still, it would be great to get a more permanent fix
for the underlying problem. Also, the msleep() patch
is still upstream, so a solution is still needed there.

I can try to get and play with one of the affected
Chromebooks, but I don't think that would help much
since we still don't know what the undocumented ground
switch is supposed to do.

Thanks,
Guenter
RE: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by David Rau 2 years, 7 months ago

-----Original Message-----
From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
Sent: Thursday, February 9, 2023 02:04
To: David Rau <david.rau.zg@renesas.com>
Cc: Mark Brown <broonie@kernel.org>; perex@perex.cz; lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music

On Tue, Feb 07, 2023 at 02:42:14AM +0000, David Rau wrote:
>> 
>> > You mean just keep the above two patches and revert 969357ec94e6 ?
>> > Sure, I can do that, but feedback from the field would take some
>> > 2-3 months. Is that what you recommend to do for now ?
>> 
>> > Thanks,
>> > Guenter
>> 
>> Thanks for the feedback.
>> What I mean is just do rollback to remove the "sleep" patch I did in your repository.
>> 
>> After the rollback, could you please let me know the average number of crashes per affected system with the same test conditions?
>> Will it still take some 2-3 months?
>> 
>The msleep() patch has been reverted in R111 and dev releases of ChromeOS. I did not get permission to land the revert in R110, meaning we'll continue to see the crashes there. 
>R111 is expected to go to Beta shortly, so we should get _some_ feedback in the next few weeks.
>Still, it would be great to get a more permanent fix for the underlying problem. Also, the msleep() patch is still upstream, so a solution is still needed there.
>I can try to get and play with one of the affected Chromebooks, but I don't think that would help much since we still don't know what the undocumented ground switch is supposed to do.
Enable the GND switch earlier is needed to ensure the stable and smooth Jack detection.

>Thanks,
>Guenter

Thanks for the kind explanation and feedback.
I am verifying another method which do the msleep() in the individual schedule work to avoid such crash issue.

Would you please let me know how to reproduce this crash phenomenon?
Then I can ensure the new solution is stronger and solve this problem as well.

Thanks.
David
Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by Guenter Roeck 2 years, 7 months ago
On 2/8/23 19:05, David Rau wrote:
> 
> 
> -----Original Message-----
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
> Sent: Thursday, February 9, 2023 02:04
> To: David Rau <david.rau.zg@renesas.com>
> Cc: Mark Brown <broonie@kernel.org>; perex@perex.cz; lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
> 
> On Tue, Feb 07, 2023 at 02:42:14AM +0000, David Rau wrote:
>>>
>>>> You mean just keep the above two patches and revert 969357ec94e6 ?
>>>> Sure, I can do that, but feedback from the field would take some
>>>> 2-3 months. Is that what you recommend to do for now ?
>>>
>>>> Thanks,
>>>> Guenter
>>>
>>> Thanks for the feedback.
>>> What I mean is just do rollback to remove the "sleep" patch I did in your repository.
>>>
>>> After the rollback, could you please let me know the average number of crashes per affected system with the same test conditions?
>>> Will it still take some 2-3 months?
>>>
>> The msleep() patch has been reverted in R111 and dev releases of ChromeOS. I did not get permission to land the revert in R110, meaning we'll continue to see the crashes there.
>> R111 is expected to go to Beta shortly, so we should get _some_ feedback in the next few weeks.
>> Still, it would be great to get a more permanent fix for the underlying problem. Also, the msleep() patch is still upstream, so a solution is still needed there.
>> I can try to get and play with one of the affected Chromebooks, but I don't think that would help much since we still don't know what the undocumented ground switch is supposed to do.
> Enable the GND switch earlier is needed to ensure the stable and smooth Jack detection.
> 

"earlier" doesn't explain the changes, nor what "earlier" actually means.
The original code enabled the ground switch in the probe function,
only it never re-enabled it after the first insertion/removal sequence.
I don't know the potential impact on power consumption, but an easier
and more straightforward fix might be to (re-)enable the ground switch
on jack removal and to keep it enabled while nothing is inserted
(assuming that helps with detection).

The new (current) code enables the ground switch with each interrupt,
even if that interrupt is a button interrupt. That is really difficult
to understand and doesn't seem to make sense.

A diagram such as the one in Figure 20, describing exactly when the
ground switch is supposed to be enabled, would help a lot. It would
be even better if that diagram provided information about any extra
delays needed after e_jack_detect_complete is raised and before the
detection code actually runs.

>> Thanks,
>> Guenter
> 
> Thanks for the kind explanation and feedback.
> I am verifying another method which do the msleep() in the individual schedule work to avoid such crash issue.
> 
> Would you please let me know how to reproduce this crash phenomenon?
> Then I can ensure the new solution is stronger and solve this problem as well.
> 

Sorry, I can't, because the crash reports are from customers and automated.
We don't know what they are doing, only that whatever they are doing
causes the hang and thus crash.

Thanks,
Guenter
Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
Posted by Guenter Roeck 2 years, 7 months ago
On 2/6/23 18:42, David Rau wrote:
> 
> 
> -----Original Message-----
> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
> Sent: Monday, February 6, 2023 22:05
> To: David Rau <david.rau.zg@renesas.com>; Mark Brown <broonie@kernel.org>
> Cc: perex@perex.cz; lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com; alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on OMTP headsets when playing music
> 
> On 2/5/23 21:38, David Rau wrote:
>>>
>>>
>>> -----Original Message-----
>>> From: Guenter Roeck <groeck7@gmail.com> On Behalf Of Guenter Roeck
>>> Sent: Saturday, February 4, 2023 23:42
>>> To: Mark Brown <broonie@kernel.org>
>>> Cc: David Rau <david.rau.zg@renesas.com>; perex@perex.cz;
>>> lgirdwood@gmail.com; tiwai@suse.com; support.opensource@diasemi.com;
>>> alsa-devel@alsa-project.org; linux-kernel@vger.kernel.org
>>> Subject: Re: [PATCH] ASoC: da7219: Fix pole orientation detection on
>>> OMTP headsets when playing music
>>>
>>> On Thu, Feb 02, 2023 at 07:36:42PM +0000, Mark Brown wrote:
>>>>
>>>>>> they have the potential to actually lock up are the
>>>>>> cancel_work_sync() calls but they were unchanged and the backtrace
>>>>>> you showed was showing the thread in the msleep().  My guess would
>>>>>> be that you've got systems where there are very frequent jack
>>>>>> detection events (potentiallly with broken accessories, or possibly
>>>>>> due to the ground switch putting things into the wrong
>>>>>> priority) and that the interrupt is firing again as soon as the
>>>>>> thread unmasks the primary interrupt which means it never actually stops running.
>>>>
>>>>> That is what I strongly suspect is happening. I don't know why
>>>>> exactly the interrupt is firing continuously, but the hang is always in msleep().
>>>>> One possibility might be that the event is actually a disconnect
>>>>> event, and that enabling and immediately disabling the ground switch
>>>>> causes another interrupt, which is then handled immediately, causing the hang.
>>>>
>>>> Could be.  I'd be willing to guess that it's not just one event but
>>>> rather a stream of events of some kind.  Possibly if it's due to the
>>>> ground switch it's spuriously detecting a constant stream of button
>>>> presses for the affected systems, which don't produce any UI visible
>>>> result which would cause users to pull the accessory for whatever
>>>> reason?  Whatever's going on I bet it's broken accessories triggering it.
>>>>
>>>
>>> That seems to be unlikely. The average number of crashes per affected system is 1.92, which points to something the users are doing and less to a broken accessory.
>>>> We do observe crashes due to broken accessories, but in those cases the number of crashes per system tends to be much > higher.
>>>
>>>> Anyway, below is a patch with a possible fix. Of course, I still don't know what the patch originally tried to fix, so it might not do much if anything good.
>>> I added the software debouncing before insertion task to ensue the better compatibility of OMTP Jack.
>>>> For example, it keeps button detection in the interrupt handler to avoid dropping button events, so if spurious button detection as you suspected is indeed (part of) the problem we might still see a large number of interrupts.
>>>
>>>> Guenter
>>>>
>>> Thanks a lot for your big efforts to implement the temporary fix and verifications.
>>> Would you please let me know the average number of crashes per affected system if you rollback to the pervious fix?
>>> Ref:
>>> https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.
>>> kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2
>>> Fcommit%2Fsound%2Fsoc%2Fcodecs%3Fid%3D2d969e8f35b1849a43156029a7a6e294
>>> 3b89d0c0&data=05%7C01%7Cdavid.rau.zg%40renesas.com%7Cae6910f8ff4e4e299
>>> bc408db084b1a2a%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638112890
>>> 873388020%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIi
>>> LCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=8KgHP%2FOD%2BTDcr
>>> rUVSATFkDCDDmhiCu7d5%2FKhyOszThA%3D&reserved=0
>>> https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.
>>> kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2
>>> Fcommit%2Fsound%2Fsoc%2Fcodecs%3Fid%3D06f5882122e3faa183d76c4ec2c92f4c
>>> 38e2c7bb&data=05%7C01%7Cdavid.rau.zg%40renesas.com%7Cae6910f8ff4e4e299
>>> bc408db084b1a2a%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638112890
>>> 873388020%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIi
>>> LCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=WosfvANk0YxeJD5PG
>>> %2FnAuAWVqt7m4U3mMaYXefLLdS4%3D&reserved=0
>>>
> 
>> You mean just keep the above two patches and revert 969357ec94e6 ?
>> Sure, I can do that, but feedback from the field would take some
>> 2-3 months. Is that what you recommend to do for now ?
> 
>> Thanks,
>> Guenter
> 
> Thanks for the feedback.
> What I mean is just do rollback to remove the "sleep" patch I did in your repository.
> 
> After the rollback, could you please let me know the average number of crashes per affected system with the same test conditions?
> Will it still take some 2-3 months?
> 

Yes, due to our rollout schedules. Those are crashes observed in the field,
after all.

Guenter