From: Adrián Larumbe <adrian.larumbe@collabora.com>
To: boris.brezillon@collabora.com, steven.price@arm.com, liviu.dudau@arm.com,
 maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de,
 airlied@gmail.com, daniel@ffwll.ch
Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: [PATCH 1/2] drm/panthor: Enable fdinfo for cycle and time measurements
Date: Tue, 5 Mar 2024 21:05:49 +0000
Message-ID: <20240305211000.659103-2-adrian.larumbe@collabora.com>
In-Reply-To: <20240305211000.659103-1-adrian.larumbe@collabora.com>
References: <20240305211000.659103-1-adrian.larumbe@collabora.com>

The firmware samples cycle and timestamp values right before jumping into
the UM (user mode) command stream and immediately after returning from it,
and the samples are then kept inside a per-job accounting structure. That
structure is held inside the group's syncobjs buffer object, at an offset
that depends on the job's queue slot number and the queue's index within
the group.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
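To make that layout concrete, here is a minimal standalone sketch of how a
job's accounting slot is located inside the group's syncobjs BO. The
constants and struct shapes mirror this patch; job_times_offset() and the
stand-in syncobj struct are illustrative only, not driver code:

#include <stddef.h>
#include <stdint.h>

#define NUM_INSTRS_PER_SLOT 32
#define SLOTSIZE (NUM_INSTRS_PER_SLOT * sizeof(uint64_t))

/* Stand-in for the driver's 64-bit syncobj; only its size matters here. */
struct panthor_syncobj_64b { uint64_t seqno; uint64_t status; };

/* Per-job samples, as introduced by this patch. */
struct panthor_job_times {
        struct { uint64_t before, after; } cycles;
        struct { uint64_t before, after; } time;
};

/*
 * One syncobj per queue comes first; the per-slot sample structs follow,
 * grouped per queue in creation order. slots_before_queue is the sum of
 * the ring buffer slot counts of all lower-indexed queues; ringbuf_idx is
 * the job's slot within its own queue's ring buffer.
 */
static size_t job_times_offset(unsigned int queue_count,
                               unsigned int slots_before_queue,
                               unsigned int ringbuf_idx)
{
        return queue_count * sizeof(struct panthor_syncobj_64b) +
               (slots_before_queue + ringbuf_idx) *
               sizeof(struct panthor_job_times);
}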
 drivers/gpu/drm/panthor/panthor_devfreq.c |  10 +
 drivers/gpu/drm/panthor/panthor_device.h  |  11 ++
 drivers/gpu/drm/panthor/panthor_drv.c     |  31 ++++
 drivers/gpu/drm/panthor/panthor_sched.c   | 217 +++++++++++++++++++---
 4 files changed, 241 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_devfreq.c b/drivers/gpu/drm/panthor/panthor_devfreq.c
index 7ac4fa290f27..51a7b734edcd 100644
--- a/drivers/gpu/drm/panthor/panthor_devfreq.c
+++ b/drivers/gpu/drm/panthor/panthor_devfreq.c
@@ -91,6 +91,7 @@ static int panthor_devfreq_get_dev_status(struct device *dev,
         spin_lock_irqsave(&pdevfreq->lock, irqflags);
 
         panthor_devfreq_update_utilization(pdevfreq);
+        ptdev->current_frequency = status->current_frequency;
 
         status->total_time = ktime_to_ns(ktime_add(pdevfreq->busy_time,
                                                    pdevfreq->idle_time));
@@ -130,6 +131,7 @@ int panthor_devfreq_init(struct panthor_device *ptdev)
         struct panthor_devfreq *pdevfreq;
         struct dev_pm_opp *opp;
         unsigned long cur_freq;
+        unsigned long freq = ULONG_MAX;
         int ret;
 
         pdevfreq = drmm_kzalloc(&ptdev->base, sizeof(*ptdev->devfreq), GFP_KERNEL);
@@ -204,6 +206,14 @@ int panthor_devfreq_init(struct panthor_device *ptdev)
 
         dev_pm_opp_put(opp);
 
+        /* Find the fastest defined rate */
+        opp = dev_pm_opp_find_freq_floor(dev, &freq);
+        if (IS_ERR(opp))
+                return PTR_ERR(opp);
+        ptdev->fast_rate = freq;
+
+        dev_pm_opp_put(opp);
+
         /*
          * Setup default thresholds for the simple_ondemand governor.
          * The values are chosen based on experiments.
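A note on the OPP lookup above, since the intent is easy to miss:
dev_pm_opp_find_freq_floor() rounds its in/out freq argument down to the
nearest defined OPP, so seeding it with ULONG_MAX yields the highest rate
in the table. The pattern, annotated (a fragment of the hunk above, using
its variables, not a standalone example):

unsigned long freq = ULONG_MAX;         /* "round down from infinity" */
struct dev_pm_opp *opp;

opp = dev_pm_opp_find_freq_floor(dev, &freq);   /* freq becomes the top OPP rate */
if (IS_ERR(opp))
        return PTR_ERR(opp);
ptdev->fast_rate = freq;

dev_pm_opp_put(opp);    /* find_freq_floor() took a reference on the OPP */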
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 51c9d61b6796..10e970921ca3 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -162,6 +162,14 @@ struct panthor_device {
                  */
                 u32 *dummy_latest_flush;
         } pm;
+
+        unsigned long current_frequency;
+        unsigned long fast_rate;
+};
+
+struct panthor_gpu_usage {
+        u64 time;
+        u64 cycles;
 };
 
 /**
@@ -176,6 +184,9 @@ struct panthor_file {
 
         /** @groups: Scheduling group pool attached to this file. */
         struct panthor_group_pool *groups;
+
+        /** @stats: cycle and timestamp measures for job execution. */
+        struct panthor_gpu_usage stats;
 };
 
 int panthor_device_init(struct panthor_device *ptdev);
diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index ff484506229f..fa06b9e2c6cd 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -3,6 +3,10 @@
 /* Copyright 2019 Linaro, Ltd., Rob Herring */
 /* Copyright 2019 Collabora ltd. */
 
+#ifdef CONFIG_HAVE_ARM_ARCH_TIMER
+#include <asm/arch_timer.h>
+#endif
+
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/of_platform.h>
@@ -28,6 +32,8 @@
 #include "panthor_regs.h"
 #include "panthor_sched.h"
 
+#define NS_PER_SEC 1000000000ULL
+
 /**
  * DOC: user <-> kernel object copy helpers.
  */
@@ -1336,6 +1342,29 @@ static int panthor_mmap(struct file *filp, struct vm_area_struct *vma)
         return ret;
 }
 
+static void panthor_gpu_show_fdinfo(struct panthor_device *ptdev,
+                                    struct panthor_file *pfile,
+                                    struct drm_printer *p)
+{
+#ifdef CONFIG_HAVE_ARM_ARCH_TIMER
+        drm_printf(p, "drm-engine-panthor:\t%llu ns\n",
+                   DIV_ROUND_UP_ULL((pfile->stats.time * NS_PER_SEC),
+                                    arch_timer_get_cntfrq()));
+#endif
+        drm_printf(p, "drm-cycles-panthor:\t%llu\n", pfile->stats.cycles);
+        drm_printf(p, "drm-maxfreq-panthor:\t%lu Hz\n", ptdev->fast_rate);
+        drm_printf(p, "drm-curfreq-panthor:\t%lu Hz\n", ptdev->current_frequency);
+}
+
+static void panthor_show_fdinfo(struct drm_printer *p, struct drm_file *file)
+{
+        struct drm_device *dev = file->minor->dev;
+        struct panthor_device *ptdev = container_of(dev, struct panthor_device, base);
+
+        panthor_gpu_show_fdinfo(ptdev, file->driver_priv, p);
+
+}
+
 static const struct file_operations panthor_drm_driver_fops = {
         .open = drm_open,
         .release = drm_release,
@@ -1345,6 +1374,7 @@ static const struct file_operations panthor_drm_driver_fops = {
         .read = drm_read,
         .llseek = noop_llseek,
         .mmap = panthor_mmap,
+        .show_fdinfo = drm_show_fdinfo,
 };
 
 #ifdef CONFIG_DEBUG_FS
@@ -1363,6 +1393,7 @@ static const struct drm_driver panthor_drm_driver = {
                            DRIVER_SYNCOBJ_TIMELINE | DRIVER_GEM_GPUVA,
         .open = panthor_open,
         .postclose = panthor_postclose,
+        .show_fdinfo = panthor_show_fdinfo,
         .ioctls = panthor_drm_driver_ioctls,
         .num_ioctls = ARRAY_SIZE(panthor_drm_driver_ioctls),
         .fops = &panthor_drm_driver_fops,
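With this wired up, reading the fdinfo node of a process's panthor DRM fd
should show the new keys; the drm-engine-panthor line only appears on
CONFIG_HAVE_ARM_ARCH_TIMER builds, and the values below are purely
illustrative:

  drm-engine-panthor:   1245381920 ns
  drm-cycles-panthor:   946036984
  drm-maxfreq-panthor:  1000000000 Hz
  drm-curfreq-panthor:  850000000 Hz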
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 5f7803b6fc48..751d1453e7a1 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -93,6 +93,8 @@
 #define MIN_CSGS 3
 #define MAX_CSG_PRIO 0xf
 
+#define SLOTSIZE (NUM_INSTRS_PER_SLOT * sizeof(u64))
+
 struct panthor_group;
 
 /**
@@ -393,7 +395,13 @@ struct panthor_queue {
 #define CSF_MAX_QUEUE_PRIO GENMASK(3, 0)
 
         /** @ringbuf: Command stream ring-buffer. */
-        struct panthor_kernel_bo *ringbuf;
+        struct {
+                /** @bo: Kernel BO that holds the ring buffer. */
+                struct panthor_kernel_bo *bo;
+
+                /** @nelem: Number of slots in the ring buffer. */
+                unsigned int nelem;
+        } ringbuf;
 
         /** @iface: Firmware interface. */
         struct {
@@ -466,6 +474,9 @@ struct panthor_queue {
                  */
                 struct list_head in_flight_jobs;
         } fence_ctx;
+
+        /** @time_offset: Offset of fdinfo stats structs in queue's syncobj. */
+        unsigned long time_offset;
 };
 
 /**
@@ -580,7 +591,26 @@ struct panthor_group {
          * One sync object per queue. The position of the sync object is
          * determined by the queue index.
          */
-        struct panthor_kernel_bo *syncobjs;
+
+        struct {
+                /** @bo: Kernel BO holding the sync objects. */
+                struct panthor_kernel_bo *bo;
+
+                /** @times_offset: Beginning of time stats after objects of sync pool. */
+                size_t times_offset;
+        } syncobjs;
+
+        /** @fdinfo: Per-file total cycle and timestamp values reference. */
+        struct {
+                /** @data: Pointer to actual per-file sample data. */
+                struct panthor_gpu_usage *data;
+
+                /**
+                 * @lock: Mutex to govern concurrent access from drm file's fdinfo
+                 * callback and job post-completion processing function.
+                 */
+                struct mutex lock;
+        } fdinfo;
 
         /** @state: Group state. */
         enum panthor_group_state state;
@@ -639,6 +669,18 @@ struct panthor_group {
         struct list_head wait_node;
 };
 
+struct panthor_job_times {
+        struct {
+                u64 before;
+                u64 after;
+        } cycles;
+
+        struct {
+                u64 before;
+                u64 after;
+        } time;
+};
+
 /**
  * group_queue_work() - Queue a group work
  * @group: Group to queue the work for.
@@ -718,6 +760,9 @@ struct panthor_job {
         /** @queue_idx: Index of the queue inside @group. */
         u32 queue_idx;
 
+        /** @ringbuf_idx: Index of the job's slot inside the queue's ring buffer. */
+        u32 ringbuf_idx;
+
         /** @call_info: Information about the userspace command stream call. */
         struct {
                 /** @start: GPU address of the userspace command stream. */
@@ -814,7 +859,7 @@ static void group_free_queue(struct panthor_group *group, struct panthor_queue *
 
         panthor_queue_put_syncwait_obj(queue);
 
-        panthor_kernel_bo_destroy(group->vm, queue->ringbuf);
+        panthor_kernel_bo_destroy(group->vm, queue->ringbuf.bo);
         panthor_kernel_bo_destroy(panthor_fw_vm(group->ptdev), queue->iface.mem);
 
         kfree(queue);
@@ -828,12 +873,14 @@ static void group_release_work(struct work_struct *work)
         struct panthor_device *ptdev = group->ptdev;
         u32 i;
 
+        mutex_destroy(&group->fdinfo.lock);
+
         for (i = 0; i < group->queue_count; i++)
                 group_free_queue(group, group->queues[i]);
 
         panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), group->suspend_buf);
         panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), group->protm_suspend_buf);
-        panthor_kernel_bo_destroy(group->vm, group->syncobjs);
+        panthor_kernel_bo_destroy(group->vm, group->syncobjs.bo);
 
         panthor_vm_put(group->vm);
         kfree(group);
@@ -970,8 +1017,8 @@ cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
         queue->iface.input->extract = queue->iface.output->extract;
         drm_WARN_ON(&ptdev->base, queue->iface.input->insert < queue->iface.input->extract);
 
-        cs_iface->input->ringbuf_base = panthor_kernel_bo_gpuva(queue->ringbuf);
-        cs_iface->input->ringbuf_size = panthor_kernel_bo_size(queue->ringbuf);
+        cs_iface->input->ringbuf_base = panthor_kernel_bo_gpuva(queue->ringbuf.bo);
+        cs_iface->input->ringbuf_size = panthor_kernel_bo_size(queue->ringbuf.bo);
         cs_iface->input->ringbuf_input = queue->iface.input_fw_va;
         cs_iface->input->ringbuf_output = queue->iface.output_fw_va;
         cs_iface->input->config = CS_CONFIG_PRIORITY(queue->priority) |
@@ -1926,7 +1973,7 @@ tick_ctx_init(struct panthor_scheduler *sched,
         }
 }
 
-#define NUM_INSTRS_PER_SLOT 16
+#define NUM_INSTRS_PER_SLOT 32
 
 static void
 group_term_post_processing(struct panthor_group *group)
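A word on the bump from 16 to 32 instructions per slot: the instrumented
stream below adds eight instructions per job (a MOV48 plus STORE_STATE pair
for both the cycle counter and the timer, before and after the CALL), so a
16-instruction slot presumably no longer fits, and the slot size is kept a
power of two so the insert-pointer masking and slot indexing stay exact. A
standalone sketch of that indexing arithmetic (illustrative, not driver
code):

#include <stdint.h>

#define NUM_INSTRS_PER_SLOT 32
#define SLOTSIZE (NUM_INSTRS_PER_SLOT * sizeof(uint64_t))  /* 256 bytes */

/*
 * ringbuf_size is a power of two, so masking wraps the insert pointer and
 * every slot index maps to exactly one panthor_job_times entry.
 */
static inline uint32_t slot_index(uint64_t insert, uint32_t ringbuf_size)
{
        return (uint32_t)(insert & (ringbuf_size - 1)) / SLOTSIZE;
}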
@@ -1964,7 +2011,7 @@ group_term_post_processing(struct panthor_group *group)
                 spin_unlock(&queue->fence_ctx.lock);
 
                 /* Manually update the syncobj seqno to unblock waiters. */
-                syncobj = group->syncobjs->kmap + (i * sizeof(*syncobj));
+                syncobj = group->syncobjs.bo->kmap + (i * sizeof(*syncobj));
                 syncobj->status = ~0;
                 syncobj->seqno = atomic64_read(&queue->fence_ctx.seqno);
                 sched_queue_work(group->ptdev->scheduler, sync_upd);
@@ -2715,6 +2762,30 @@ void panthor_sched_post_reset(struct panthor_device *ptdev)
         sched_queue_work(sched, sync_upd);
 }
 
+static void update_fdinfo_stats(struct panthor_job *job)
+{
+        struct panthor_group *group = job->group;
+        struct panthor_queue *queue = group->queues[job->queue_idx];
+        struct panthor_device *ptdev = group->ptdev;
+        struct panthor_gpu_usage *fdinfo;
+        struct panthor_job_times *times;
+
+        if (drm_WARN_ON(&ptdev->base, job->ringbuf_idx >= queue->ringbuf.nelem))
+                return;
+
+        times = (struct panthor_job_times *)
+                ((unsigned long)group->syncobjs.bo->kmap + queue->time_offset +
+                 (job->ringbuf_idx * sizeof(struct panthor_job_times)));
+
+        mutex_lock(&group->fdinfo.lock);
+        if (group->fdinfo.data) {
+                fdinfo = group->fdinfo.data;
+                fdinfo->cycles += times->cycles.after - times->cycles.before;
+                fdinfo->time += times->time.after - times->time.before;
+        }
+        mutex_unlock(&group->fdinfo.lock);
+}
+
 static void group_sync_upd_work(struct work_struct *work)
 {
         struct panthor_group *group =
@@ -2732,7 +2803,7 @@ static void group_sync_upd_work(struct work_struct *work)
                 if (!queue)
                         continue;
 
-                syncobj = group->syncobjs->kmap + (queue_idx * sizeof(*syncobj));
+                syncobj = group->syncobjs.bo->kmap + (queue_idx * sizeof(*syncobj));
 
                 spin_lock(&queue->fence_ctx.lock);
                 list_for_each_entry_safe(job, job_tmp, &queue->fence_ctx.in_flight_jobs, node) {
@@ -2750,6 +2821,7 @@ static void group_sync_upd_work(struct work_struct *work)
         dma_fence_end_signalling(cookie);
 
         list_for_each_entry_safe(job, job_tmp, &done_jobs, node) {
+                update_fdinfo_stats(job);
                 list_del_init(&job->node);
                 panthor_job_put(&job->base);
         }
@@ -2765,13 +2837,19 @@ queue_run_job(struct drm_sched_job *sched_job)
         struct panthor_queue *queue = group->queues[job->queue_idx];
         struct panthor_device *ptdev = group->ptdev;
         struct panthor_scheduler *sched = ptdev->scheduler;
-        u32 ringbuf_size = panthor_kernel_bo_size(queue->ringbuf);
+        u32 ringbuf_size = panthor_kernel_bo_size(queue->ringbuf.bo);
         u32 ringbuf_insert = queue->iface.input->insert & (ringbuf_size - 1);
+        u32 ringbuf_index = ringbuf_insert / (SLOTSIZE);
         u64 addr_reg = ptdev->csif_info.cs_reg_count -
                        ptdev->csif_info.unpreserved_cs_reg_count;
         u64 val_reg = addr_reg + 2;
-        u64 sync_addr = panthor_kernel_bo_gpuva(group->syncobjs) +
-                        job->queue_idx * sizeof(struct panthor_syncobj_64b);
+        u64 cycle_reg = addr_reg;
+        u64 time_reg = val_reg;
+        u64 sync_addr = panthor_kernel_bo_gpuva(group->syncobjs.bo) +
+                        job->queue_idx * sizeof(struct panthor_syncobj_64b);
+        u64 times_addr = panthor_kernel_bo_gpuva(group->syncobjs.bo) + queue->time_offset +
+                         (ringbuf_index * sizeof(struct panthor_job_times));
+
         u32 waitall_mask = GENMASK(sched->sb_slot_count - 1, 0);
         struct dma_fence *done_fence;
         int ret;
job_times, time.before)), + + /* STORE_STATE cycles */ + (40ull << 56) | (cycle_reg << 40) | (1ll << 32), + + /* STORE_STATE timer */ + (40ull << 56) | (time_reg << 40) | (0ll << 32), + /* MOV48 rX:rX+1, cs.start */ (1ull << 56) | (addr_reg << 48) | job->call_info.start, =20 @@ -2795,6 +2885,18 @@ queue_run_job(struct drm_sched_job *sched_job) /* CALL rX:rX+1, rX+2 */ (32ull << 56) | (addr_reg << 40) | (val_reg << 32), =20 + /* MOV48 rX:rX+1, cycles_offset */ + (1ull << 56) | (cycle_reg << 48) | (times_addr + offsetof(struct panthor= _job_times, cycles.after)), + + /* MOV48 rX:rX+1, time_offset */ + (1ull << 56) | (time_reg << 48) | (times_addr + offsetof(struct panthor_= job_times, time.after)), + + /* STORE_STATE cycles */ + (40ull << 56) | (cycle_reg << 40) | (1ll << 32), + + /* STORE_STATE timer */ + (40ull << 56) | (time_reg << 40) | (0ll << 32), + /* MOV48 rX:rX+1, sync_addr */ (1ull << 56) | (addr_reg << 48) | sync_addr, =20 @@ -2839,7 +2941,7 @@ queue_run_job(struct drm_sched_job *sched_job) queue->fence_ctx.id, atomic64_inc_return(&queue->fence_ctx.seqno)); =20 - memcpy(queue->ringbuf->kmap + ringbuf_insert, + memcpy(queue->ringbuf.bo->kmap + ringbuf_insert, call_instrs, sizeof(call_instrs)); =20 panthor_job_get(&job->base); @@ -2849,6 +2951,7 @@ queue_run_job(struct drm_sched_job *sched_job) =20 job->ringbuf.start =3D queue->iface.input->insert; job->ringbuf.end =3D job->ringbuf.start + sizeof(call_instrs); + job->ringbuf_idx =3D ringbuf_index; =20 /* Make sure the ring buffer is updated before the INSERT * register. @@ -2939,7 +3042,8 @@ static const struct drm_sched_backend_ops panthor_que= ue_sched_ops =3D { =20 static struct panthor_queue * group_create_queue(struct panthor_group *group, - const struct drm_panthor_queue_create *args) + const struct drm_panthor_queue_create *args, + unsigned int slots_so_far) { struct drm_gpu_scheduler *drm_sched; struct panthor_queue *queue; @@ -2965,21 +3069,23 @@ group_create_queue(struct panthor_group *group, =20 queue->priority =3D args->priority; =20 - queue->ringbuf =3D panthor_kernel_bo_create(group->ptdev, group->vm, + queue->ringbuf.bo =3D panthor_kernel_bo_create(group->ptdev, group->vm, args->ringbuf_size, DRM_PANTHOR_BO_NO_MMAP, DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC | DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED, PANTHOR_VM_KERNEL_AUTO_VA); - if (IS_ERR(queue->ringbuf)) { - ret =3D PTR_ERR(queue->ringbuf); + if (IS_ERR(queue->ringbuf.bo)) { + ret =3D PTR_ERR(queue->ringbuf.bo); goto err_free_queue; } =20 - ret =3D panthor_kernel_bo_vmap(queue->ringbuf); + ret =3D panthor_kernel_bo_vmap(queue->ringbuf.bo); if (ret) goto err_free_queue; =20 + queue->ringbuf.nelem =3D (args->ringbuf_size / (SLOTSIZE)); + queue->iface.mem =3D panthor_fw_alloc_queue_iface_mem(group->ptdev, &queue->iface.input, &queue->iface.output, @@ -2990,6 +3096,9 @@ group_create_queue(struct panthor_group *group, goto err_free_queue; } =20 + queue->time_offset =3D group->syncobjs.times_offset + + (slots_so_far * sizeof(struct panthor_job_times)); + ret =3D drm_sched_init(&queue->scheduler, &panthor_queue_sched_ops, group->ptdev->scheduler->wq, 1, args->ringbuf_size / (NUM_INSTRS_PER_SLOT * sizeof(u64)), @@ -3020,6 +3129,7 @@ int panthor_group_create(struct panthor_file *pfile, struct panthor_scheduler *sched =3D ptdev->scheduler; struct panthor_fw_csg_iface *csg_iface =3D panthor_fw_get_csg_iface(ptdev= , 0); struct panthor_group *group =3D NULL; + unsigned int total_slots; u32 gid, i, suspend_size; int ret; =20 @@ -3086,33 +3196,77 @@ int panthor_group_create(struct 
@@ -2939,7 +3042,8 @@ static const struct drm_sched_backend_ops panthor_queue_sched_ops = {
 
 static struct panthor_queue *
 group_create_queue(struct panthor_group *group,
-                   const struct drm_panthor_queue_create *args)
+                   const struct drm_panthor_queue_create *args,
+                   unsigned int slots_so_far)
 {
         struct drm_gpu_scheduler *drm_sched;
         struct panthor_queue *queue;
@@ -2965,21 +3069,23 @@ group_create_queue(struct panthor_group *group,
 
         queue->priority = args->priority;
 
-        queue->ringbuf = panthor_kernel_bo_create(group->ptdev, group->vm,
+        queue->ringbuf.bo = panthor_kernel_bo_create(group->ptdev, group->vm,
                                                   args->ringbuf_size,
                                                   DRM_PANTHOR_BO_NO_MMAP,
                                                   DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
                                                   DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
                                                   PANTHOR_VM_KERNEL_AUTO_VA);
-        if (IS_ERR(queue->ringbuf)) {
-                ret = PTR_ERR(queue->ringbuf);
+        if (IS_ERR(queue->ringbuf.bo)) {
+                ret = PTR_ERR(queue->ringbuf.bo);
                 goto err_free_queue;
         }
 
-        ret = panthor_kernel_bo_vmap(queue->ringbuf);
+        ret = panthor_kernel_bo_vmap(queue->ringbuf.bo);
         if (ret)
                 goto err_free_queue;
 
+        queue->ringbuf.nelem = (args->ringbuf_size / (SLOTSIZE));
+
         queue->iface.mem = panthor_fw_alloc_queue_iface_mem(group->ptdev,
                                                             &queue->iface.input,
                                                             &queue->iface.output,
@@ -2990,6 +3096,9 @@ group_create_queue(struct panthor_group *group,
                 goto err_free_queue;
         }
 
+        queue->time_offset = group->syncobjs.times_offset +
+                             (slots_so_far * sizeof(struct panthor_job_times));
+
         ret = drm_sched_init(&queue->scheduler, &panthor_queue_sched_ops,
                              group->ptdev->scheduler->wq, 1,
                              args->ringbuf_size / (NUM_INSTRS_PER_SLOT * sizeof(u64)),
@@ -3020,6 +3129,7 @@ int panthor_group_create(struct panthor_file *pfile,
         struct panthor_scheduler *sched = ptdev->scheduler;
         struct panthor_fw_csg_iface *csg_iface = panthor_fw_get_csg_iface(ptdev, 0);
         struct panthor_group *group = NULL;
+        unsigned int total_slots;
         u32 gid, i, suspend_size;
         int ret;
 
@@ -3086,33 +3196,77 @@ int panthor_group_create(struct panthor_file *pfile,
                 goto err_put_group;
         }
 
-        group->syncobjs = panthor_kernel_bo_create(ptdev, group->vm,
-                                                   group_args->queues.count *
-                                                   sizeof(struct panthor_syncobj_64b),
+        /*
+         * Need to add size for the fdinfo sample structs, as many as the sum
+         * of the number of job slots for every single queue ring buffer.
+         */
+
+        for (i = 0, total_slots = 0; i < group_args->queues.count; i++)
+                total_slots += (queue_args[i].ringbuf_size / (SLOTSIZE));
+
+        /*
+         * Memory layout of the group's syncobjs BO:
+         *
+         * group->syncobjs.bo {
+         *      struct panthor_syncobj_64b sync1;
+         *      struct panthor_syncobj_64b sync2;
+         *      ...
+         *      as many as group_args->queues.count
+         *      ...
+         *      struct panthor_syncobj_64b syncn;
+         *      struct panthor_job_times queue1_slot1;
+         *      struct panthor_job_times queue1_slot2;
+         *      ...
+         *      as many as queue[1].ringbuf_size / SLOTSIZE
+         *      ...
+         *      struct panthor_job_times queue1_slotp;
+         *      ...
+         *      as many queues as group_args->queues.count
+         *      ...
+         *      struct panthor_job_times queuen_slot1;
+         *      struct panthor_job_times queuen_slot2;
+         *      ...
+         *      as many as queue[n].ringbuf_size / SLOTSIZE
+         *      struct panthor_job_times queuen_slotq;
+         * }
+         *
+         * Linearly: group->syncobjs.bo = {syncobj1, ..., syncobjn,
+         *      {queue1 = {js1, ..., jsp}, ..., queuen = {js1, ..., jsq}}}
+         */
+
+        group->syncobjs.bo = panthor_kernel_bo_create(ptdev, group->vm,
+                                                      (group_args->queues.count *
+                                                       sizeof(struct panthor_syncobj_64b)) +
+                                                      (total_slots * sizeof(struct panthor_job_times)),
                                                    DRM_PANTHOR_BO_NO_MMAP,
                                                    DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
                                                    DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
                                                    PANTHOR_VM_KERNEL_AUTO_VA);
-        if (IS_ERR(group->syncobjs)) {
-                ret = PTR_ERR(group->syncobjs);
+        if (IS_ERR(group->syncobjs.bo)) {
+                ret = PTR_ERR(group->syncobjs.bo);
                 goto err_put_group;
         }
 
-        ret = panthor_kernel_bo_vmap(group->syncobjs);
+        ret = panthor_kernel_bo_vmap(group->syncobjs.bo);
         if (ret)
                 goto err_put_group;
 
-        memset(group->syncobjs->kmap, 0,
-               group_args->queues.count * sizeof(struct panthor_syncobj_64b));
+        memset(group->syncobjs.bo->kmap, 0,
+               (group_args->queues.count * sizeof(struct panthor_syncobj_64b)) +
+               (total_slots * sizeof(struct panthor_job_times)));
 
-        for (i = 0; i < group_args->queues.count; i++) {
-                group->queues[i] = group_create_queue(group, &queue_args[i]);
+        group->syncobjs.times_offset =
+                group_args->queues.count * sizeof(struct panthor_syncobj_64b);
+
+        for (i = 0, total_slots = 0; i < group_args->queues.count; i++) {
+                group->queues[i] = group_create_queue(group, &queue_args[i], total_slots);
                 if (IS_ERR(group->queues[i])) {
                         ret = PTR_ERR(group->queues[i]);
                         group->queues[i] = NULL;
                         goto err_put_group;
                 }
 
+                total_slots += (queue_args[i].ringbuf_size / (SLOTSIZE));
                 group->queue_count++;
         }
 
@@ -3133,6 +3287,9 @@ int panthor_group_create(struct panthor_file *pfile,
         }
         mutex_unlock(&sched->reset.lock);
 
+        group->fdinfo.data = &pfile->stats;
+        mutex_init(&group->fdinfo.lock);
+
         return gid;
 
 err_put_group:
@@ -3172,6 +3329,10 @@ int panthor_group_destroy(struct panthor_file *pfile, u32 group_handle)
         mutex_unlock(&sched->lock);
         mutex_unlock(&sched->reset.lock);
 
+        mutex_lock(&group->fdinfo.lock);
+        group->fdinfo.data = NULL;
+        mutex_unlock(&group->fdinfo.lock);
+
         group_put(group);
         return 0;
 }
-- 
2.43.0
From: Adrián Larumbe <adrian.larumbe@collabora.com>
To: boris.brezillon@collabora.com, steven.price@arm.com, liviu.dudau@arm.com,
 maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de,
 airlied@gmail.com, daniel@ffwll.ch
Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: [PATCH 2/2] drm/panthor: Enable fdinfo for memory stats
Date: Tue, 5 Mar 2024 21:05:50 +0000
Message-ID: <20240305211000.659103-3-adrian.larumbe@collabora.com>
In-Reply-To: <20240305211000.659103-1-adrian.larumbe@collabora.com>
References: <20240305211000.659103-1-adrian.larumbe@collabora.com>

Expose GEM object memory usage through the fdinfo interface. Note that when
vm-binding an already-created BO, the entirety of its virtual size is backed
by system memory, so its RSS always matches its virtual size.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
---
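With drm_show_memory_stats() in place, the DRM core derives the standard
per-region memory keys from each object's size plus the flags returned by
the new status callback; since the callback only ever reports
DRM_GEM_OBJECT_RESIDENT, a purgeable line is not expected. The fdinfo output
should gain lines roughly like the following (sizes purely illustrative):

  drm-total-memory:     16496 KiB
  drm-shared-memory:    0
  drm-active-memory:    0
  drm-resident-memory:  16496 KiB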
 drivers/gpu/drm/panthor/panthor_drv.c |  1 +
 drivers/gpu/drm/panthor/panthor_gem.c | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
index fa06b9e2c6cd..a5398e161f75 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1363,6 +1363,7 @@ static void panthor_show_fdinfo(struct drm_printer *p, struct drm_file *file)
 
         panthor_gpu_show_fdinfo(ptdev, file->driver_priv, p);
 
+        drm_show_memory_stats(p, file);
 }
 
 static const struct file_operations panthor_drm_driver_fops = {
diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
index d6483266d0c2..845724e3fd93 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.c
+++ b/drivers/gpu/drm/panthor/panthor_gem.c
@@ -143,6 +143,17 @@ panthor_gem_prime_export(struct drm_gem_object *obj, int flags)
         return drm_gem_prime_export(obj, flags);
 }
 
+static enum drm_gem_object_status panthor_gem_status(struct drm_gem_object *obj)
+{
+        struct panthor_gem_object *bo = to_panthor_bo(obj);
+        enum drm_gem_object_status res = 0;
+
+        if (bo->base.pages)
+                res |= DRM_GEM_OBJECT_RESIDENT;
+
+        return res;
+}
+
 static const struct drm_gem_object_funcs panthor_gem_funcs = {
         .free = panthor_gem_free_object,
         .print_info = drm_gem_shmem_object_print_info,
@@ -152,6 +163,7 @@ static const struct drm_gem_object_funcs panthor_gem_funcs = {
         .vmap = drm_gem_shmem_object_vmap,
         .vunmap = drm_gem_shmem_object_vunmap,
         .mmap = panthor_gem_mmap,
+        .status = panthor_gem_status,
         .export = panthor_gem_prime_export,
         .vm_ops = &drm_gem_shmem_vm_ops,
 };
-- 
2.43.0