From: Matt Coster <matt.coster@imgtec.com>
Date: Mon, 18 Nov 2024 13:02:01 +0000
Subject: [PATCH v2 09/21] drm/imagination: Revert to non-threaded IRQs
Message-ID: <20241118-sets-bxs-4-64-patch-v1-v2-9-3fd45d9fb0cf@imgtec.com>
References: <20241118-sets-bxs-4-64-patch-v1-v2-0-3fd45d9fb0cf@imgtec.com>
In-Reply-To: <20241118-sets-bxs-4-64-patch-v1-v2-0-3fd45d9fb0cf@imgtec.com>
To: Frank Binns, Matt Coster, David Airlie, Simona Vetter,
 Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, Rob Herring,
 Krzysztof Kozlowski, Conor Dooley, Nishanth Menon, Vignesh Raghavendra,
 Tero Kristo
CC: Randolph Sapp, Darren Etheridge
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: b4 0.14.2

For some reason, threaded IRQs do not play nice with the RISC-V firmware
processor in BXS on our test platform (TI AM68). Until we can resolve this
issue, revert to a more traditional workqueue-based IRQ implementation so
the platform is at least functional.
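
The control flow this revert moves to can be sketched in userspace C. This
is only an illustration under stated assumptions: the toy work queue and
every name below (wq_queue, wq_drain, fake_dev, fake_irq_handler, ...) are
hypothetical stand-ins, not driver or kernel APIs. It shows a hard IRQ
handler that only acknowledges the interrupt and queues a work item, with
the actual processing done later by a worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace illustration only. Every type and function here is a
 * hypothetical stand-in; none of them are kernel or driver APIs.
 */

struct work_item {
	void (*fn)(struct work_item *work);
	bool pending;	/* set while the item sits on a queue */
};

#define WQ_CAPACITY 8

struct work_queue {
	struct work_item *items[WQ_CAPACITY];
	size_t count;
};

/* Queue an item unless it is already pending; returns true if queued. */
static bool wq_queue(struct work_queue *wq, struct work_item *work)
{
	if (work->pending)
		return false;

	assert(wq->count < WQ_CAPACITY);
	work->pending = true;
	wq->items[wq->count++] = work;
	return true;
}

/* Run queued items in FIFO order, as a worker would; returns how many ran. */
static size_t wq_drain(struct work_queue *wq)
{
	size_t ran = 0;

	for (size_t i = 0; i < wq->count; i++) {
		struct work_item *work = wq->items[i];

		work->pending = false;
		work->fn(work);
		ran++;
	}
	wq->count = 0;
	return ran;
}

struct fake_dev {
	bool irq_pending;	/* stands in for the firmware IRQ status bit */
	bool fw_booted;
	struct work_queue irq_wq;
	struct work_item fwccb_work;
	int fwccb_runs;		/* counts worker invocations */
};

enum irq_result { IRQ_RESULT_NONE, IRQ_RESULT_HANDLED };

static void fake_fwccb_worker(struct work_item *work)
{
	/* Recover the device from its embedded work item. */
	struct fake_dev *dev = (struct fake_dev *)
		((char *)work - offsetof(struct fake_dev, fwccb_work));

	dev->fwccb_runs++;
}

static void fake_dev_init(struct fake_dev *dev)
{
	*dev = (struct fake_dev){ .fw_booted = true };
	dev->fwccb_work.fn = fake_fwccb_worker;
}

/* Hard IRQ context: acknowledge the interrupt and defer the real work. */
static enum irq_result fake_irq_handler(struct fake_dev *dev)
{
	if (!dev->irq_pending)
		return IRQ_RESULT_NONE;	/* spurious, nothing to do */

	dev->irq_pending = false;	/* clear the interrupt source */

	/* Only queue work while the firmware is running. */
	if (dev->fw_booted)
		wq_queue(&dev->irq_wq, &dev->fwccb_work);

	return IRQ_RESULT_HANDLED;
}
```

In this sketch, two interrupts that arrive before the worker runs result in
a single worker invocation, because an item that is already pending is not
queued again. The kernel's queue_work() has the same property, so each
worker has to handle everything outstanding when it runs rather than assume
one run per interrupt.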

Signed-off-by: Matt Coster <matt.coster@imgtec.com>
---
Changes in v2:
- None
- Link to v1: https://lore.kernel.org/r/20241105-sets-bxs-4-64-patch-v1-v1-9-4ed30e865892@imgtec.com
---
 drivers/gpu/drm/imagination/pvr_ccb.c    | 25 +++++++--
 drivers/gpu/drm/imagination/pvr_device.c | 94 ++++++++++----------------------
 drivers/gpu/drm/imagination/pvr_device.h | 12 ++++
 drivers/gpu/drm/imagination/pvr_fw.c     |  1 +
 drivers/gpu/drm/imagination/pvr_power.c  |  4 +-
 drivers/gpu/drm/imagination/pvr_queue.c  | 23 +++++++-
 6 files changed, 87 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_ccb.c b/drivers/gpu/drm/imagination/pvr_ccb.c
index 2bbdc05a3b97791426fd1f2ffe8f306d15622901..3bc3dc1bcc30ace720a09a2af5cadc1c6ccfd93f 100644
--- a/drivers/gpu/drm/imagination/pvr_ccb.c
+++ b/drivers/gpu/drm/imagination/pvr_ccb.c
@@ -156,6 +156,13 @@ process_fwccb_command(struct pvr_device *pvr_dev, struct rogue_fwif_fwccb_cmd *c
 	}
 }
 
+static void pvr_fwccb_process_worker(struct work_struct *work)
+{
+	struct pvr_device *pvr_dev = container_of_const(work, struct pvr_device, fwccb_work);
+
+	pvr_fwccb_process(pvr_dev);
+}
+
 /**
  * pvr_fwccb_process() - Process any pending FWCCB commands
  * @pvr_dev: Target PowerVR device
@@ -463,19 +470,16 @@ struct pvr_kccb_fence {
 };
 
 /**
- * pvr_kccb_wake_up_waiters() - Check the KCCB waiters
+ * pvr_kccb_check_waiters() - Check the KCCB waiters
  * @pvr_dev: Target PowerVR device
  *
  * Signal as many KCCB fences as we have slots available.
  */
-void pvr_kccb_wake_up_waiters(struct pvr_device *pvr_dev)
+static void pvr_kccb_check_waiters(struct pvr_device *pvr_dev)
 {
 	struct pvr_kccb_fence *fence, *tmp_fence;
 	u32 used_count, available_count;
 
-	/* Wake up those waiting for KCCB slot execution. */
-	wake_up_all(&pvr_dev->kccb.rtn_q);
-
 	/* Then iterate over all KCCB fences and signal as many as we can. */
 	mutex_lock(&pvr_dev->kccb.ccb.lock);
 	used_count = pvr_kccb_used_slot_count_locked(pvr_dev);
@@ -499,12 +503,20 @@ void pvr_kccb_wake_up_waiters(struct pvr_device *pvr_dev)
 	mutex_unlock(&pvr_dev->kccb.ccb.lock);
 }
 
+static void pvr_kccb_process_worker(struct work_struct *work)
+{
+	struct pvr_device *pvr_dev = container_of_const(work, struct pvr_device, kccb.work);
+
+	pvr_kccb_check_waiters(pvr_dev);
+}
+
 /**
  * pvr_kccb_fini() - Cleanup device KCCB
  * @pvr_dev: Target PowerVR device
  */
 void pvr_kccb_fini(struct pvr_device *pvr_dev)
 {
+	cancel_work_sync(&pvr_dev->kccb.work);
 	pvr_ccb_fini(&pvr_dev->kccb.ccb);
 	WARN_ON(!list_empty(&pvr_dev->kccb.waiters));
 	WARN_ON(pvr_dev->kccb.reserved_count);
@@ -525,6 +537,7 @@ pvr_kccb_init(struct pvr_device *pvr_dev)
 	INIT_LIST_HEAD(&pvr_dev->kccb.waiters);
 	pvr_dev->kccb.fence_ctx.id = dma_fence_context_alloc(1);
 	spin_lock_init(&pvr_dev->kccb.fence_ctx.lock);
+	INIT_WORK(&pvr_dev->kccb.work, pvr_kccb_process_worker);
 
 	return pvr_ccb_init(pvr_dev, &pvr_dev->kccb.ccb,
 			    ROGUE_FWIF_KCCB_NUMCMDS_LOG2_DEFAULT,
@@ -639,6 +652,8 @@ void pvr_kccb_release_slot(struct pvr_device *pvr_dev)
 int
 pvr_fwccb_init(struct pvr_device *pvr_dev)
 {
+	INIT_WORK(&pvr_dev->fwccb_work, pvr_fwccb_process_worker);
+
 	return pvr_ccb_init(pvr_dev, &pvr_dev->fwccb,
 			    ROGUE_FWIF_FWCCB_NUMCMDS_LOG2,
 			    sizeof(struct rogue_fwif_fwccb_cmd));
diff --git a/drivers/gpu/drm/imagination/pvr_device.c b/drivers/gpu/drm/imagination/pvr_device.c
index 1704c0268589bdeb65fa6535f9ec63182b0a3e94..43411fe64fcecd8f84c0ceabb329f2901d63ed93 100644
--- a/drivers/gpu/drm/imagination/pvr_device.c
+++ b/drivers/gpu/drm/imagination/pvr_device.c
@@ -120,72 +120,24 @@ static int pvr_device_clk_init(struct pvr_device *pvr_dev)
 	return 0;
 }
 
-/**
- * pvr_device_process_active_queues() - Process all queue related events.
- * @pvr_dev: PowerVR device to check
- *
- * This is called any time we receive a FW event. It iterates over all
- * active queues and calls pvr_queue_process() on them.
- */
-static void pvr_device_process_active_queues(struct pvr_device *pvr_dev)
-{
-	struct pvr_queue *queue, *tmp_queue;
-	LIST_HEAD(active_queues);
-
-	mutex_lock(&pvr_dev->queues.lock);
-
-	/* Move all active queues to a temporary list. Queues that remain
-	 * active after we're done processing them are re-inserted to
-	 * the queues.active list by pvr_queue_process().
-	 */
-	list_splice_init(&pvr_dev->queues.active, &active_queues);
-
-	list_for_each_entry_safe(queue, tmp_queue, &active_queues, node)
-		pvr_queue_process(queue);
-
-	mutex_unlock(&pvr_dev->queues.lock);
-}
-
-static irqreturn_t pvr_device_irq_thread_handler(int irq, void *data)
+static irqreturn_t pvr_device_irq_handler(int irq, void *data)
 {
 	struct pvr_device *pvr_dev = data;
-	irqreturn_t ret = IRQ_NONE;
-
-	/* We are in the threaded handler, we can keep dequeuing events until we
-	 * don't see any. This should allow us to reduce the number of interrupts
-	 * when the GPU is receiving a massive amount of short jobs.
-	 */
-	while (pvr_fw_irq_pending(pvr_dev)) {
-		pvr_fw_irq_clear(pvr_dev);
 
-		if (pvr_dev->fw_dev.booted) {
-			pvr_fwccb_process(pvr_dev);
-			pvr_kccb_wake_up_waiters(pvr_dev);
-			pvr_device_process_active_queues(pvr_dev);
-		}
+	if (!pvr_fw_irq_pending(pvr_dev))
+		return IRQ_NONE; /* Spurious IRQ - ignore */
 
-		pm_runtime_mark_last_busy(from_pvr_device(pvr_dev)->dev);
+	pvr_fw_irq_clear(pvr_dev);
 
-		ret = IRQ_HANDLED;
+	/* Only process IRQ work if FW is currently running */
+	if (pvr_dev->fw_dev.booted) {
+		queue_work(pvr_dev->irq_wq, &pvr_dev->fwccb_work);
+		wake_up_all(&pvr_dev->kccb.rtn_q);
+		queue_work(pvr_dev->irq_wq, &pvr_dev->kccb.work);
+		queue_work(pvr_dev->irq_wq, &pvr_dev->queues.work);
 	}
 
-	/* Unmask FW irqs before returning, so new interrupts can be received. */
-	pvr_fw_irq_enable(pvr_dev);
-	return ret;
-}
-
-static irqreturn_t pvr_device_irq_handler(int irq, void *data)
-{
-	struct pvr_device *pvr_dev = data;
-
-	if (!pvr_fw_irq_pending(pvr_dev))
-		return IRQ_NONE; /* Spurious IRQ - ignore. */
-
-	/* Mask the FW interrupts before waking up the thread. Will be unmasked
-	 * when the thread handler is done processing events.
-	 */
-	pvr_fw_irq_disable(pvr_dev);
-	return IRQ_WAKE_THREAD;
+	return IRQ_HANDLED;
 }
 
 /**
@@ -202,20 +154,33 @@ pvr_device_irq_init(struct pvr_device *pvr_dev)
 {
 	struct drm_device *drm_dev = from_pvr_device(pvr_dev);
 	struct platform_device *plat_dev = to_platform_device(drm_dev->dev);
+	int err;
 
 	init_waitqueue_head(&pvr_dev->kccb.rtn_q);
 
+	pvr_dev->irq_wq = alloc_workqueue("powervr-irq", WQ_UNBOUND, 0);
+	if (!pvr_dev->irq_wq) {
+		err = -ENOMEM;
+		goto err_out;
+	}
+
 	pvr_dev->irq = platform_get_irq(plat_dev, 0);
-	if (pvr_dev->irq < 0)
-		return pvr_dev->irq;
+	if (pvr_dev->irq < 0) {
+		err = pvr_dev->irq;
+		goto err_destroy_wq;
+	}
 
 	/* Clear any pending events before requesting the IRQ line. */
 	pvr_fw_irq_clear(pvr_dev);
 	pvr_fw_irq_enable(pvr_dev);
 
-	return request_threaded_irq(pvr_dev->irq, pvr_device_irq_handler,
-				    pvr_device_irq_thread_handler,
-				    IRQF_SHARED, "gpu", pvr_dev);
+	return request_irq(pvr_dev->irq, pvr_device_irq_handler, 0, "gpu", pvr_dev);
+
+err_destroy_wq:
+	destroy_workqueue(pvr_dev->irq_wq);
+
+err_out:
+	return err;
 }
 
 /**
@@ -226,6 +191,7 @@ static void
 pvr_device_irq_fini(struct pvr_device *pvr_dev)
 {
 	free_irq(pvr_dev->irq, pvr_dev);
+	destroy_workqueue(pvr_dev->irq_wq);
 }
 
 /**
diff --git a/drivers/gpu/drm/imagination/pvr_device.h b/drivers/gpu/drm/imagination/pvr_device.h
index 470945ccfcac7ce91161aa6c70d33177fbb3533f..930129334807168cf11ee843f446eb5063464d55 100644
--- a/drivers/gpu/drm/imagination/pvr_device.h
+++ b/drivers/gpu/drm/imagination/pvr_device.h
@@ -141,9 +141,15 @@ struct pvr_device {
 	/** @irq: IRQ number. */
 	int irq;
 
+	/** @irq_wq: Workqueue for actions triggered off the IRQ handler. */
+	struct workqueue_struct *irq_wq;
+
 	/** @fwccb: Firmware CCB. */
 	struct pvr_ccb fwccb;
 
+	/** @fwccb_work: Work item for FWCCB processing. */
+	struct work_struct fwccb_work;
+
 	/**
 	 * @kernel_vm_ctx: Virtual memory context used for kernel mappings.
 	 *
@@ -210,6 +216,9 @@ struct pvr_device {
 		/** @queues.lock: Lock protecting access to the active/idle
 		 * lists. */
 		struct mutex lock;
+
+		/** @queues.work: Work item for queue processing. */
+		struct work_struct work;
 	} queues;
 
 	/**
@@ -258,6 +267,9 @@ struct pvr_device {
 		 * future use. */
 		u32 reserved_count;
 
+		/** @kccb.work: Work item for KCCB processing. */
+		struct work_struct work;
+
 		/**
 		 * @kccb.waiters: List of KCCB slot waiters.
 		 */
diff --git a/drivers/gpu/drm/imagination/pvr_fw.c b/drivers/gpu/drm/imagination/pvr_fw.c
index 3debc9870a82ae7de9b2dc173df84c466c137bb3..9c8929d8602ead3390aa5c1b2505845b961b1406 100644
--- a/drivers/gpu/drm/imagination/pvr_fw.c
+++ b/drivers/gpu/drm/imagination/pvr_fw.c
@@ -1036,6 +1036,7 @@ pvr_fw_fini(struct pvr_device *pvr_dev)
 	 * Ensure FWCCB worker has finished executing before destroying FWCCB. The IRQ handler has
 	 * been unregistered at this point so no new work should be being submitted.
 	 */
+	flush_work(&pvr_dev->fwccb_work);
 	pvr_ccb_fini(&pvr_dev->fwccb);
 	pvr_kccb_fini(pvr_dev);
 	pvr_fw_cleanup(pvr_dev);
diff --git a/drivers/gpu/drm/imagination/pvr_power.c b/drivers/gpu/drm/imagination/pvr_power.c
index 19b079b357df78e8bcdecfa377fc9c05b6e8e4b0..000ded17f7ca4c5cdadc5aba4f4d76fac0bbcc0c 100644
--- a/drivers/gpu/drm/imagination/pvr_power.c
+++ b/drivers/gpu/drm/imagination/pvr_power.c
@@ -122,7 +122,7 @@ pvr_power_fw_enable(struct pvr_device *pvr_dev)
 		return err;
 	}
 
-	queue_delayed_work(pvr_dev->sched_wq, &pvr_dev->watchdog.work,
+	queue_delayed_work(pvr_dev->irq_wq, &pvr_dev->watchdog.work,
 			   msecs_to_jiffies(WATCHDOG_TIME_MS));
 
 	return 0;
@@ -212,7 +212,7 @@ pvr_watchdog_worker(struct work_struct *work)
 
 out_requeue:
 	if (!pvr_dev->lost) {
-		queue_delayed_work(pvr_dev->sched_wq, &pvr_dev->watchdog.work,
+		queue_delayed_work(pvr_dev->irq_wq, &pvr_dev->watchdog.work,
 				   msecs_to_jiffies(WATCHDOG_TIME_MS));
 	}
 }
diff --git a/drivers/gpu/drm/imagination/pvr_queue.c b/drivers/gpu/drm/imagination/pvr_queue.c
index c4f08432882b12f5cdfeb7fc991fd941f0946676..5f6a880987d9f35804da60e1258efa30beed56ab 100644
--- a/drivers/gpu/drm/imagination/pvr_queue.c
+++ b/drivers/gpu/drm/imagination/pvr_queue.c
@@ -838,7 +838,7 @@ pvr_queue_timedout_job(struct drm_sched_job *s_job)
 	} else {
 		atomic_set(&queue->in_flight_job_count, job_count);
 		list_move_tail(&queue->node, &pvr_dev->queues.active);
-		pvr_queue_process(queue);
+		queue_work(pvr_dev->irq_wq, &pvr_dev->queues.work);
 	}
 	mutex_unlock(&pvr_dev->queues.lock);
 
@@ -991,6 +991,26 @@ void pvr_queue_process(struct pvr_queue *queue)
 	pvr_queue_update_active_state_locked(queue);
 }
 
+static void pvr_queue_process_worker(struct work_struct *work)
+{
+	struct pvr_device *pvr_dev = container_of_const(work, struct pvr_device, queues.work);
+	struct pvr_queue *queue, *tmp_queue;
+	LIST_HEAD(active_queues);
+
+	mutex_lock(&pvr_dev->queues.lock);
+
+	list_splice_init(&pvr_dev->queues.active, &active_queues);
+
+	list_for_each_entry_safe(queue, tmp_queue, &active_queues, node) {
+		pvr_queue_check_job_waiting_for_cccb_space(queue);
+		pvr_queue_signal_done_fences(queue);
+
+		pvr_queue_update_active_state_locked(queue);
+	}
+
+	mutex_unlock(&pvr_dev->queues.lock);
+}
+
 static u32 get_dm_type(struct pvr_queue *queue)
 {
 	switch (queue->type) {
@@ -1407,6 +1427,7 @@ int pvr_queue_device_init(struct pvr_device *pvr_dev)
 {
 	int err;
 
+	INIT_WORK(&pvr_dev->queues.work, pvr_queue_process_worker);
 	INIT_LIST_HEAD(&pvr_dev->queues.active);
 	INIT_LIST_HEAD(&pvr_dev->queues.idle);
 	err = drmm_mutex_init(from_pvr_device(pvr_dev), &pvr_dev->queues.lock);

-- 
2.47.0