[PATCH v8] media: vsp1: Add VSPX support

Jacopo Mondi posted 1 patch 9 months, 1 week ago
There is a newer version of this series
drivers/media/platform/renesas/vsp1/Makefile    |   1 +
drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
include/media/vsp1.h                            |  75 +++
7 files changed, 770 insertions(+), 1 deletion(-)
[PATCH v8] media: vsp1: Add VSPX support
Posted by Jacopo Mondi 9 months, 1 week ago
Add support for VSPX, a specialized version of the VSP2 that
transfers data to the ISP. The VSPX is composed of two RPF units
to read data from external memory and an IIF instance that performs
transfer towards the ISP.

The VSPX is supported through a newly introduced vsp1_vspx.c file that
exposes two interfaces: vsp1_vspx interface, declared in vsp1_vspx.h
for the vsp1 core to initialize and cleanup the VSPX, and a vsp1_isp
interface, declared in include/media/vsp1.h for the ISP driver to
control the VSPX operations.

Signed-off-by: Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
---
The VSPX is a VSP2 function that reads data from external memory using
two RPF instances and feed it to the ISP.

The VSPX includes an IIF unit (ISP InterFace) modeled in the vsp1 driver
as a new, simple, entity type.

IIF is part of VSPX, a version of the VSP2 IP specialized for ISP
interfacing. To prepare to support VSPX, support IIF first by
introducing a new entity and by adjusting the RPF/WPF drivers to
operate correctly when an IIF is present.

Changes in v8:
- Remove patches already collected by Laurent in
  [GIT PULL FOR v6.16] Renesas media drivers changes

- Rebased on
  https://gitlab.freedesktop.org/linux-media/users/pinchartl.git #renesas-next

- Changes to the VSPX interface towards the ISP
  - Split start/stop_streaming
  - Add vsp1_isp_jobs_release() to release pending jobs
  - Add vsp1_isp_free_buffer()
  - Remove vsp1_isp_configure() and compute partitions on job creation

- Driver changes
  - Drop irq-driver flow
    The VSPX used to schedule new jobs as soon as processing the last
    one is done. This doesn't work well with the R-Car ISP design
    for two reasons:
    - The ISP needs per-job registers programming
    - The ISP and VSPX job queues have to stay in sync

- Minors
  - Remove the jobs_lock as a single lock is fine
  - Protect against VSPX/ISP irq races in job_run() by checking the
    VSPX 'busy' register and remove the 'processing' flag
  - Manually set the pipeline state to STOPPED before scheduling a new
    job without waiting for frame_end

Changes in v7:
- Include VSPX driver in the series
- Use existing VSP1 formats and remove patches extending formats on RPF
- Rework VSPX driver to split jobs creation and scheduling in two
  different API entry points
- Fix VSPX stride using the user provided bytesperline and using the
  buffer size for ConfigDMA buffers
- Link to v6: https://lore.kernel.org/r/20250321-v4h-iif-v6-0-361e9043026a@ideasonboard.com

Changes in v6:
- Little cosmetic change as suggested by Laurent
- Collect tags
- Link to v5: https://lore.kernel.org/r/20250319-v4h-iif-v5-0-0a10456d792c@ideasonboard.com

Changes in v5:
- Drop additional empty line 5/6
- Link to v4: https://lore.kernel.org/r/20250318-v4h-iif-v4-0-10ed4c41c195@ideasonboard.com

Changes in v4:
- Fix SWAP bits for RAW10, RAW12 and RAW16
- Link to v3: https://lore.kernel.org/r/20250317-v4h-iif-v3-0-63aab8982b50@ideasonboard.com

Changes in v3:
- Drop 2/6 from v2
- Add 5/7 to prepare for a new implementation of 6/7
- Individual changelog per patch
- Add 7/7
- Link to v2: https://lore.kernel.org/r/20250224-v4h-iif-v2-0-0305e3c1fe2d@ideasonboard.com

Changes in v2:
- Collect tags
- Address review comments from Laurent, a lot of tiny changes here and
  there but no major redesign worth an entry in the patchset changelog
---
 drivers/media/platform/renesas/vsp1/Makefile    |   1 +
 drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
 drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
 drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
 drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
 drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
 include/media/vsp1.h                            |  75 +++
 7 files changed, 770 insertions(+), 1 deletion(-)

diff --git a/drivers/media/platform/renesas/vsp1/Makefile b/drivers/media/platform/renesas/vsp1/Makefile
index de8c802e1d1a..2057c8f7be47 100644
--- a/drivers/media/platform/renesas/vsp1/Makefile
+++ b/drivers/media/platform/renesas/vsp1/Makefile
@@ -6,5 +6,6 @@ vsp1-y					+= vsp1_clu.o vsp1_hsit.o vsp1_lut.o
 vsp1-y					+= vsp1_brx.o vsp1_sru.o vsp1_uds.o
 vsp1-y					+= vsp1_hgo.o vsp1_hgt.o vsp1_histo.o
 vsp1-y					+= vsp1_iif.o vsp1_lif.o vsp1_uif.o
+vsp1-y					+= vsp1_vspx.o
 
 obj-$(CONFIG_VIDEO_RENESAS_VSP1)	+= vsp1.o
diff --git a/drivers/media/platform/renesas/vsp1/vsp1.h b/drivers/media/platform/renesas/vsp1/vsp1.h
index 263024639dd2..1872e403f26b 100644
--- a/drivers/media/platform/renesas/vsp1/vsp1.h
+++ b/drivers/media/platform/renesas/vsp1/vsp1.h
@@ -110,6 +110,7 @@ struct vsp1_device {
 	struct media_entity_operations media_ops;
 
 	struct vsp1_drm *drm;
+	struct vsp1_vspx *vspx;
 };
 
 int vsp1_device_get(struct vsp1_device *vsp1);
diff --git a/drivers/media/platform/renesas/vsp1/vsp1_drv.c b/drivers/media/platform/renesas/vsp1/vsp1_drv.c
index d13e9b31aa7c..e338ab8af292 100644
--- a/drivers/media/platform/renesas/vsp1/vsp1_drv.c
+++ b/drivers/media/platform/renesas/vsp1/vsp1_drv.c
@@ -38,6 +38,7 @@
 #include "vsp1_uds.h"
 #include "vsp1_uif.h"
 #include "vsp1_video.h"
+#include "vsp1_vspx.h"
 
 /* -----------------------------------------------------------------------------
  * Interrupt Handling
@@ -488,7 +489,10 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
 
 		ret = media_device_register(mdev);
 	} else {
-		ret = vsp1_drm_init(vsp1);
+		if (vsp1->info->version == VI6_IP_VERSION_MODEL_VSPX_GEN4)
+			ret = vsp1_vspx_init(vsp1);
+		else
+			ret = vsp1_drm_init(vsp1);
 	}
 
 done:
@@ -846,6 +850,13 @@ static const struct vsp1_device_info vsp1_device_infos[] = {
 		.uif_count = 2,
 		.wpf_count = 1,
 		.num_bru_inputs = 5,
+	}, {
+		.version = VI6_IP_VERSION_MODEL_VSPX_GEN4,
+		.model = "VSP2-X",
+		.gen = 4,
+		.features = VSP1_HAS_IIF,
+		.rpf_count = 2,
+		.wpf_count = 1,
 	},
 };
 
diff --git a/drivers/media/platform/renesas/vsp1/vsp1_regs.h b/drivers/media/platform/renesas/vsp1/vsp1_regs.h
index 86e47c2d991f..10cfbcd1b6e0 100644
--- a/drivers/media/platform/renesas/vsp1/vsp1_regs.h
+++ b/drivers/media/platform/renesas/vsp1/vsp1_regs.h
@@ -799,6 +799,7 @@
 #define VI6_IP_VERSION_MODEL_VSPDL_GEN3	(0x19 << 8)
 #define VI6_IP_VERSION_MODEL_VSPBS_GEN3	(0x1a << 8)
 #define VI6_IP_VERSION_MODEL_VSPD_GEN4	(0x1c << 8)
+#define VI6_IP_VERSION_MODEL_VSPX_GEN4	(0x1d << 8)
 /* RZ/G2L SoCs have no version register, So use 0x80 as the model version */
 #define VI6_IP_VERSION_MODEL_VSPD_RZG2L	(0x80 << 8)
 
diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
new file mode 100644
index 000000000000..6edb5e4929d9
--- /dev/null
+++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
@@ -0,0 +1,664 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * vsp1_vspx.c  --  R-Car Gen 4 VSPX
+ *
+ * Copyright (C) 2025 Ideas On Board Oy
+ * Copyright (C) 2025 Renesas Electronics Corporation
+ */
+
+#include "vsp1_vspx.h"
+
+#include <linux/cleanup.h>
+#include <linux/container_of.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include <media/media-entity.h>
+#include <media/v4l2-subdev.h>
+#include <media/vsp1.h>
+
+#include "vsp1_dl.h"
+#include "vsp1_iif.h"
+#include "vsp1_pipe.h"
+#include "vsp1_rwpf.h"
+
+/*
+ * struct vsp1_vspx_pipeline - VSPX pipeline
+ * @pipe: the VSP1 pipeline
+ * @partition: the pre-calculated partition used by the pipeline
+ * @vspx_lock: protect access to the VSPX configuration and the jobs queue
+ * @enabled: the enable flag
+ * @jobs: jobs queue
+ * @vspx_frame_end: frame end callback
+ * @frame_end_data: data for the frame end callback
+ */
+struct vsp1_vspx_pipeline {
+	struct vsp1_pipeline pipe;
+	struct vsp1_partition partition;
+
+	/* Protects the pipeline configuration */
+	spinlock_t vspx_lock;
+	bool enabled;
+
+	struct list_head jobs;
+
+	void (*vspx_frame_end)(void *frame_end_data);
+	void *frame_end_data;
+};
+
+static inline struct vsp1_vspx_pipeline *
+to_vsp1_vspx_pipeline(struct vsp1_pipeline *pipe)
+{
+	return container_of(pipe, struct vsp1_vspx_pipeline, pipe);
+}
+
+/*
+ * struct vsp1_vspx_job - VSPX transfer job
+ * @dl: display list populated by vsp1_isp_job_prepare()
+ * @job_queue: list handle
+ */
+struct vsp1_vspx_job {
+	struct vsp1_dl_list *dl;
+	struct list_head job_queue;
+};
+
+/*
+ * struct vsp1_vspx - VSPX device
+ * @vsp1: the VSP1 device
+ * @pipe: the VSPX pipeline
+ */
+struct vsp1_vspx {
+	struct vsp1_device *vsp1;
+	struct vsp1_vspx_pipeline pipe;
+};
+
+static const struct v4l2_pix_format_mplane vspx_default_fmt = {
+	.width = 1920,
+	.height = 1080,
+	.pixelformat = V4L2_PIX_FMT_SRGGB8,
+	.field = V4L2_FIELD_NONE,
+	.num_planes = 1,
+	.plane_fmt = {
+		[0] = {
+			.sizeimage = 1920 * 1080,
+			.bytesperline = 1920,
+		},
+	},
+};
+
+/*
+ * Apply the given width, height and fourcc to the subdevice inside the
+ * VSP1 entity.
+ */
+static int vsp1_vspx_rwpf_set_subdev_fmt(struct vsp1_device *vsp1,
+					 struct vsp1_rwpf *rwpf,
+					 u32 isp_fourcc,
+					 unsigned int width,
+					 unsigned int height)
+{
+	struct vsp1_entity *ent = &rwpf->entity;
+	const struct vsp1_format_info *fmtinfo;
+	struct v4l2_subdev_format format = {};
+	u32 vspx_fourcc;
+
+	switch (isp_fourcc) {
+	case V4L2_PIX_FMT_GREY:
+		/* 8 bit RAW Bayer image. */
+		vspx_fourcc = V4L2_PIX_FMT_RGB332;
+		break;
+	case V4L2_PIX_FMT_Y10:
+	case V4L2_PIX_FMT_Y12:
+	case V4L2_PIX_FMT_Y16:
+		/* 10, 12 and 16 bit RAW Bayer image. */
+		vspx_fourcc = V4L2_PIX_FMT_RGB565;
+		break;
+	case V4L2_META_FMT_GENERIC_8:
+		/* ConfigDMA parameters buffer. */
+		vspx_fourcc = V4L2_PIX_FMT_XBGR32;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	fmtinfo = vsp1_get_format_info(vsp1, vspx_fourcc);
+	rwpf->fmtinfo = fmtinfo;
+
+	format.which = V4L2_SUBDEV_FORMAT_ACTIVE;
+	format.pad = RWPF_PAD_SINK;
+	format.format.width = width;
+	format.format.height = height;
+	format.format.field = V4L2_FIELD_NONE;
+	format.format.code = fmtinfo->mbus;
+
+	return v4l2_subdev_call(&ent->subdev, pad, set_fmt, NULL, &format);
+}
+
+/* Configure RPF0 for ConfigDMA and RAW image transfer. */
+static int vsp1_vspx_rpf0_configure(struct vsp1_device *vsp1,
+				    dma_addr_t addr, u32 isp_fourcc,
+				    unsigned int width, unsigned int height,
+				    unsigned int stride,
+				    unsigned int iif_sink_pad,
+				    struct vsp1_dl_list *dl,
+				    struct vsp1_dl_body *dlb)
+{
+	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
+	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
+	struct vsp1_rwpf *rpf0 = pipe->inputs[0];
+	int ret;
+
+	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, rpf0, isp_fourcc, width,
+					    height);
+	if (ret)
+		return ret;
+
+	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output, isp_fourcc,
+					    width, height);
+	if (ret)
+		return ret;
+
+	vsp1_pipeline_calculate_partition(pipe, &pipe->part_table[0],
+					  width, 0);
+
+	rpf0->format.plane_fmt[0].bytesperline = stride;
+
+	/*
+	 * Connect RPF0 to the IIF sink pad corresponding to the config or image
+	 * path.
+	 */
+	rpf0->entity.sink_pad = iif_sink_pad;
+
+	pipe->part_table[0].rpf[0].width = width;
+	pipe->part_table[0].rpf[0].height = height;
+
+	rpf0->mem.addr[0] = addr;
+	rpf0->mem.addr[1] = 0;
+	rpf0->mem.addr[2] = 0;
+
+	vsp1_entity_route_setup(&rpf0->entity, pipe, dlb);
+	vsp1_entity_configure_stream(&rpf0->entity, rpf0->entity.state, pipe,
+				     dl, dlb);
+	vsp1_entity_configure_partition(&rpf0->entity, pipe,
+					&pipe->part_table[0], dl, dlb);
+
+	return 0;
+}
+
+/* -----------------------------------------------------------------------------
+ * Interrupt handling
+ */
+
+static void vsp1_vspx_pipeline_frame_end(struct vsp1_pipeline *pipe,
+					 unsigned int completion)
+{
+	struct vsp1_vspx_pipeline *vspx_pipe = to_vsp1_vspx_pipeline(pipe);
+
+	if (vspx_pipe->vspx_frame_end)
+		vspx_pipe->vspx_frame_end(vspx_pipe->frame_end_data);
+}
+
+/* -----------------------------------------------------------------------------
+ * ISP Driver API (include/media/vsp1.h)
+ */
+
+/**
+ * vsp1_isp_init() - Initialize the VSPX
+ *
+ * @dev: The VSP1 struct device
+ *
+ * Return: %0 on success or a negative error code on failure
+ */
+int vsp1_isp_init(struct device *dev)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+
+	if (!vsp1)
+		return -EPROBE_DEFER;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_init);
+
+/**
+ * vsp1_isp_get_bus_master - Get VSPX bus master
+ *
+ * The VSPX accesses memory through an FCPX instance. When allocating memory
+ * buffers that will have to be accessed by the VSPX the 'struct device' of
+ * the FCPX should be used. Use this function to get a reference to it.
+ *
+ * @dev: The VSP1 struct device
+ *
+ * Return: a pointer to the bus master's device
+ */
+struct device *vsp1_isp_get_bus_master(struct device *dev)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+
+	if (!vsp1)
+		return ERR_PTR(-ENODEV);
+
+	return vsp1->bus_master;
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_get_bus_master);
+
+/**
+ * vsp1_isp_alloc_buffer - Allocate a buffer in the VSPX address space
+ *
+ * Allocate a buffer that will be later accessed by the VSPX. Buffers allocated
+ * using vsp1_isp_alloc_buffer() shall be released with a call to
+ * vsp1_isp_free_buffer().
+ *
+ * @dev: The VSP1 struct device
+ * @size: The size of the buffer to be allocated by the VSPX
+ * @buffer_desc: The allocated buffer descriptor, will be filled with the
+ *		 buffer CPU-mapped address, the bus address and the allocated
+ *		 buffer size
+ *
+ * Return: %0 on success or a negative error code on failure
+ */
+int vsp1_isp_alloc_buffer(struct device *dev, size_t size,
+			  struct vsp1_isp_buffer_desc *buffer_desc)
+{
+	struct device *bus_master = vsp1_isp_get_bus_master(dev);
+
+	if (IS_ERR_OR_NULL(bus_master))
+		return -ENODEV;
+
+	buffer_desc->cpu_addr = dma_alloc_coherent(bus_master, size,
+						   &buffer_desc->dma_addr,
+						   GFP_KERNEL);
+	if (!buffer_desc->cpu_addr)
+		return -ENOMEM;
+
+	buffer_desc->size = size;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_alloc_buffer);
+
+/**
+ * vsp1_isp_free_buffer - Release a buffer allocated by vsp1_isp_alloc_buffer()
+ *
+ * Release memory in the VSPX address space allocated by
+ * vsp1_isp_alloc_buffer().
+ *
+ * @dev: The VSP1 struct device
+ * @buffer_desc: The descriptor of the buffer to release as returned by
+ *		 vsp1_isp_alloc_buffer()
+ */
+void vsp1_isp_free_buffer(struct device *dev,
+			  struct vsp1_isp_buffer_desc *buffer_desc)
+{
+	struct device *bus_master = vsp1_isp_get_bus_master(dev);
+
+	if (IS_ERR_OR_NULL(bus_master))
+		return;
+
+	dma_free_coherent(bus_master, buffer_desc->size, buffer_desc->cpu_addr,
+			  buffer_desc->dma_addr);
+}
+
+/**
+ * vsp1_isp_start_streaming - Start processing VSPX jobs
+ *
+ * Start the VSPX and prepare for accepting buffer transfer job requests.
+ *
+ * @dev: The VSP1 struct device
+ * @frame_end: The frame end callback description
+ *
+ * Return: %0 on success or a negative error code on failure
+ */
+int vsp1_isp_start_streaming(struct device *dev,
+			     struct vsp1_vspx_frame_end *frame_end)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
+	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
+	unsigned long flags;
+	u32 value;
+	int ret;
+
+	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
+
+	if (vspx_pipe->enabled) {
+		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+		return 0;
+	}
+
+	if (!frame_end) {
+		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+		return -EINVAL;
+	}
+
+	vspx_pipe->vspx_frame_end = frame_end->vspx_frame_end;
+	vspx_pipe->frame_end_data = frame_end->frame_end_data;
+
+	/* Make sure VSPX is not active. */
+	value = vsp1_read(vsp1, VI6_CMD(0));
+	if (value & VI6_CMD_STRCMD) {
+		dev_err(vsp1->dev,
+			"%s: Starting of WPF0 already reserved\n", __func__);
+		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+		return -EBUSY;
+	}
+
+	value = vsp1_read(vsp1, VI6_STATUS);
+	if (value & VI6_STATUS_SYS_ACT(0)) {
+		dev_err(vsp1->dev,
+			"%s: WPF0 has not entered idle state\n", __func__);
+		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+		return -EBUSY;
+	}
+
+	vspx_pipe->enabled = true;
+
+	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+
+	/* Enable the VSP1 and prepare for streaming. */
+	vsp1_pipeline_dump(pipe, "VSPX job");
+
+	ret = vsp1_device_get(vsp1);
+	if (ret < 0) {
+		guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
+		vspx_pipe->enabled = false;
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_start_streaming);
+
+/**
+ * vsp1_isp_stop_streaming - Stop the VSPX
+ *
+ * @dev: The VSP1 struct device
+ */
+void vsp1_isp_stop_streaming(struct device *dev)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
+	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
+	unsigned long flags;
+
+	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
+
+	if (!vspx_pipe->enabled) {
+		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+		return;
+	}
+
+	vspx_pipe->enabled = false;
+
+	pipe->state = VSP1_PIPELINE_STOPPED;
+	vsp1_pipeline_stop(pipe);
+	vspx_pipe->vspx_frame_end = NULL;
+	vsp1_dlm_reset(pipe->output->dlm);
+
+	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
+
+	vsp1_device_put(vsp1);
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_stop_streaming);
+
+/**
+ * vsp1_vspx_job_prepare - Prepare a new buffer transfer job
+ *
+ * Prepare a new buffer transfer job by populating a display list that will be
+ * later executed by a call to vsp1_isp_job_run().
+ *
+ * The newly created job is appended to the queue of pending jobs. All pending
+ * jobs must be released after stopping the streaming operations with a call
+ * to vsp1_isp_jobs_release().
+ *
+ * @dev: The VSP1 struct device
+ * @job: The job description
+ *
+ * Return: %0 on success or a negative error code on failure
+ */
+int vsp1_isp_job_prepare(struct device *dev,
+			 const struct vsp1_isp_job_desc *desc)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
+	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
+	const struct v4l2_pix_format_mplane *pix_mp;
+	struct vsp1_dl_list *second_dl = NULL;
+	struct vsp1_vspx_job *job;
+	struct vsp1_dl_body *dlb;
+	struct vsp1_dl_list *dl;
+	int ret;
+
+	/* Memory will be released when the job is consumed. */
+	job = kmalloc(sizeof(*job), GFP_KERNEL);
+	if (!job)
+		return -ENOMEM;
+
+	/*
+	 * Transfer the buffers described in the job: an optional ConfigDMA
+	 * parameters buffer and a RAW image.
+	 */
+
+	job->dl = vsp1_dl_list_get(pipe->output->dlm);
+	if (!job->dl) {
+		ret = -ENOMEM;
+		goto error_free_job;
+	}
+
+	dl = job->dl;
+	dlb = vsp1_dl_list_get_body0(dl);
+
+	/* Disable RPF1. */
+	vsp1_dl_body_write(dlb, vsp1->rpf[1]->entity.route->reg,
+			   VI6_DPR_NODE_UNUSED);
+
+	/* Configure IIF routing and enable IIF function */
+	vsp1_entity_route_setup(pipe->iif, pipe, dlb);
+	vsp1_entity_configure_stream(pipe->iif, pipe->iif->state, pipe,
+				     dl, dlb);
+
+	/* Configure WPF0 to enable RPF0 as source*/
+	vsp1_entity_route_setup(&pipe->output->entity, pipe, dlb);
+	vsp1_entity_configure_stream(&pipe->output->entity,
+				     pipe->output->entity.state, pipe,
+				     dl, dlb);
+
+	if (desc->config.pairs) {
+		/*
+		 * Configure RPF0 for config data. Transfer the number of
+		 * configuration pairs plus 2 words for the header.
+		 */
+		ret = vsp1_vspx_rpf0_configure(vsp1, desc->config.mem,
+					       V4L2_META_FMT_GENERIC_8,
+					       desc->config.pairs * 2 + 2, 1,
+					       desc->config.pairs * 2 + 2,
+					       VSPX_IIF_SINK_PAD_CONFIG,
+					       dl, dlb);
+		if (ret)
+			goto error_put_dl;
+
+		second_dl = vsp1_dl_list_get(pipe->output->dlm);
+		if (!second_dl) {
+			ret = -ENOMEM;
+			goto error_put_dl;
+		}
+
+		dl = second_dl;
+		dlb = vsp1_dl_list_get_body0(dl);
+	}
+
+	/* Configure RPF0 for RAW image transfer. */
+	pix_mp = &desc->img.fmt.fmt.pix_mp;
+	ret = vsp1_vspx_rpf0_configure(vsp1, desc->img.mem,
+				       pix_mp->pixelformat,
+				       pix_mp->width, pix_mp->height,
+				       pix_mp->plane_fmt[0].bytesperline,
+				       VSPX_IIF_SINK_PAD_IMG, dl, dlb);
+	if (ret)
+		goto error_put_dl;
+
+	if (second_dl)
+		vsp1_dl_list_add_chain(job->dl, second_dl);
+
+	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
+		list_add_tail(&job->job_queue, &vspx_pipe->jobs);
+	}
+
+	return 0;
+
+error_put_dl:
+	if (second_dl)
+		vsp1_dl_list_put(second_dl);
+	vsp1_dl_list_put(job->dl);
+error_free_job:
+	kfree(job);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_job_prepare);
+
+/**
+ * vsp1_isp_job_run - Run the first pending transfer job
+ *
+ * Extract the first available job from the pending job queue and execute the
+ * display list there contained. The ISP driver shall call this function when a
+ * new transfer of a ConfigDMA parameters buffer and RAW image is required.
+ *
+ * @dev: The VSP1 struct device
+ *
+ * Return: %0 on success or a negative error code on failure
+ */
+int vsp1_isp_job_run(struct device *dev)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
+	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
+	struct vsp1_vspx_job *job;
+	u32 value;
+
+	/* Make sure VSPX is not busy processing a frame. */
+	value = vsp1_read(vsp1, VI6_CMD(0));
+	if (value) {
+		dev_err(vsp1->dev,
+			"%s: Starting of WPF0 already reserved\n", __func__);
+		return -EBUSY;
+	}
+
+	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
+
+		job = list_first_entry_or_null(&vspx_pipe->jobs,
+					       struct vsp1_vspx_job,
+					       job_queue);
+		if (!job)
+			return -ENODEV;
+
+		list_del(&job->job_queue);
+
+		/* Apply the display list. */
+		vsp1_dl_list_commit(job->dl, 0);
+		kfree(job);
+	}
+
+	scoped_guard(spinlock_irqsave, &pipe->irqlock) {
+		/*
+		 * Operating the vsp1_pipe in singleshot mode requires to
+		 * manually set the pipeline state to stopped.
+		 */
+		pipe->state = VSP1_PIPELINE_STOPPED;
+		vsp1_pipeline_run(pipe);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_job_run);
+
+/**
+ * vsp1_isp_jobs_release - Release all pending VSPX jobs
+ *
+ * Release jobs prepared by a call to vsp1_vspx_job_prepare() and not yet
+ * processed. Pending jobs shall be released after streaming has been stopped.
+ *
+ * @dev: The VSP1 struct device
+ */
+void vsp1_isp_jobs_release(struct device *dev)
+{
+	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
+	struct vsp1_vspx_job *job, *tmp;
+
+	guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
+
+	list_for_each_entry_safe(job, tmp, &vspx_pipe->jobs, job_queue) {
+		list_del(&job->job_queue);
+		vsp1_dl_list_put(job->dl);
+		kfree(job);
+	}
+}
+EXPORT_SYMBOL_GPL(vsp1_isp_jobs_release);
+
+/* -----------------------------------------------------------------------------
+ * Initialization and cleanup
+ */
+
+int vsp1_vspx_init(struct vsp1_device *vsp1)
+{
+	struct vsp1_vspx_pipeline *vspx_pipe;
+	struct vsp1_pipeline *pipe;
+
+	vsp1->vspx = devm_kzalloc(vsp1->dev, sizeof(*vsp1->vspx), GFP_KERNEL);
+	if (!vsp1->vspx)
+		return -ENOMEM;
+
+	vsp1->vspx->vsp1 = vsp1;
+
+	vspx_pipe = &vsp1->vspx->pipe;
+	vspx_pipe->enabled = false;
+
+	pipe = &vspx_pipe->pipe;
+
+	vsp1_pipeline_init(pipe);
+
+	pipe->partitions = 1;
+	pipe->part_table = &vspx_pipe->partition;
+	pipe->interlaced = false;
+	pipe->frame_end = vsp1_vspx_pipeline_frame_end;
+
+	INIT_LIST_HEAD(&vspx_pipe->jobs);
+	spin_lock_init(&vspx_pipe->vspx_lock);
+
+	/*
+	 * Initialize RPF0 as input for VSPX and use it unconditionally for
+	 * now.
+	 */
+	pipe->inputs[0] = vsp1->rpf[0];
+	pipe->inputs[0]->entity.pipe = pipe;
+	pipe->inputs[0]->entity.sink = &vsp1->iif->entity;
+	vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->inputs[0],
+				      vspx_default_fmt.pixelformat,
+				      vspx_default_fmt.width,
+				      vspx_default_fmt.height);
+	list_add(&pipe->inputs[0]->entity.list_pipe, &pipe->entities);
+
+	pipe->iif = &vsp1->iif->entity;
+	pipe->iif->pipe = pipe;
+	pipe->iif->sink = &vsp1->wpf[0]->entity;
+	pipe->iif->sink_pad = RWPF_PAD_SINK;
+	list_add_tail(&pipe->iif->list_pipe, &pipe->entities);
+
+	pipe->output = vsp1->wpf[0];
+	pipe->output->entity.pipe = pipe;
+	vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output,
+				      vspx_default_fmt.pixelformat,
+				      vspx_default_fmt.width,
+				      vspx_default_fmt.height);
+	list_add_tail(&pipe->output->entity.list_pipe, &pipe->entities);
+
+	return 0;
+}
+
+void vsp1_vspx_cleanup(struct vsp1_device *vsp1)
+{
+	vsp1_isp_jobs_release(vsp1->dev);
+}
diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.h b/drivers/media/platform/renesas/vsp1/vsp1_vspx.h
new file mode 100644
index 000000000000..f871bf9e7dec
--- /dev/null
+++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * vsp1_vspx.h  --  R-Car Gen 4 VSPX
+ *
+ * Copyright (C) 2025 Ideas On Board Oy
+ * Copyright (C) 2025 Renesas Electronics Corporation
+ */
+#ifndef __VSP1_VSPX_H__
+#define __VSP1_VSPX_H__
+
+#include "vsp1.h"
+
+int vsp1_vspx_init(struct vsp1_device *vsp1);
+void vsp1_vspx_cleanup(struct vsp1_device *vsp1);
+
+#endif /* __VSP1_VSPX_H__ */
diff --git a/include/media/vsp1.h b/include/media/vsp1.h
index 48f4a5023d81..6843e8e85520 100644
--- a/include/media/vsp1.h
+++ b/include/media/vsp1.h
@@ -15,6 +15,10 @@
 
 struct device;
 
+/* -----------------------------------------------------------------------------
+ * VSP1 DU interface
+ */
+
 int vsp1_du_init(struct device *dev);
 
 #define VSP1_DU_STATUS_COMPLETE		BIT(0)
@@ -117,4 +121,75 @@ void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index,
 int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt);
 void vsp1_du_unmap_sg(struct device *dev, struct sg_table *sgt);
 
+/* -----------------------------------------------------------------------------
+ * VSP1 ISP interface
+ */
+
+/**
+ * struct vsp1_isp_buffer_desc - Describe a buffer allocated by VSPX
+ *
+ * @size: Byte size of the buffer allocated by VSPX
+ * @cpu_addr: CPU-mapped address of a buffer allocated by VSPX
+ * @dma_addr: bus address of a buffer allocated by VSPX
+ */
+struct vsp1_isp_buffer_desc {
+	size_t size;
+	void *cpu_addr;
+	dma_addr_t dma_addr;
+};
+
+/**
+ * struct vsp1_isp_job_desc - Describe a VSPX buffer transfer request
+ *
+ * Describe a transfer request for the VSPX to perform on behalf of the ISP.
+ * The transfer job descriptor contains an optional ConfigDMA buffer and
+ * one RAW image buffer. Set config.pairs to 0 if no ConfigDMA buffer should
+ * be transferred.
+ *
+ * @config: ConfigDMA buffer
+ * @config.pairs: number of reg-value pairs in the ConfigDMA buffer
+ * @config.mem: bus address of the ConfigDMA buffer
+ * @img: RAW image buffer
+ * @img.fmt: RAW image format
+ * @img.mem: bus address of the RAW image buffer
+ */
+struct vsp1_isp_job_desc {
+	struct {
+		unsigned int pairs;
+		dma_addr_t mem;
+	} config;
+	struct {
+		struct v4l2_format fmt;
+		dma_addr_t mem;
+	} img;
+};
+
+/**
+ * struct vsp1_vspx_frame_end - VSPX frame end callback data
+ *
+ * @vspx_frame_end: Frame end callback. Called after a transfer job has been
+ *		    completed. If the job includes both a ConfigDMA and a
+ *		    RAW image, the callback is called after both have been
+ *		    transferred
+ * @frame_end_data: Frame end callback data, passed to vspx_frame_end
+ */
+struct vsp1_vspx_frame_end {
+	void (*vspx_frame_end)(void *data);
+	void *frame_end_data;
+};
+
+int vsp1_isp_init(struct device *dev);
+struct device *vsp1_isp_get_bus_master(struct device *dev);
+int vsp1_isp_alloc_buffer(struct device *dev, size_t size,
+			  struct vsp1_isp_buffer_desc *buffer_desc);
+void vsp1_isp_free_buffer(struct device *dev,
+			  struct vsp1_isp_buffer_desc *buffer_desc);
+int vsp1_isp_start_streaming(struct device *dev,
+			     struct vsp1_vspx_frame_end *frame_end);
+void vsp1_isp_stop_streaming(struct device *dev);
+int vsp1_isp_job_prepare(struct device *dev,
+			 const struct vsp1_isp_job_desc *desc);
+int vsp1_isp_job_run(struct device *dev);
+void vsp1_isp_jobs_release(struct device *dev);
+
 #endif /* __MEDIA_VSP1_H__ */

---
base-commit: 1d41f477d6ff5f5eb0b78b37644ffac0785602c9
change-id: 20250502-b4-vspx-90c815bff6dd

Best regards,
-- 
Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
Re: [PATCH v8] media: vsp1: Add VSPX support
Posted by Laurent Pinchart 9 months, 1 week ago
Hi Jacopo,

Thank you for the patch.

On Fri, May 02, 2025 at 03:23:10PM +0200, Jacopo Mondi wrote:
> Add support for VSPX, a specialized version of the VSP2 that
> transfers data to the ISP. The VSPX is composed of two RPF units
> to read data from external memory and an IIF instance that performs
> transfer towards the ISP.
> 
> The VSPX is supported through a newly introduced vsp1_vspx.c file that
> exposes two interfaces: vsp1_vspx interface, declared in vsp1_vspx.h
> for the vsp1 core to initialize and cleanup the VSPX, and a vsp1_isp
> interface, declared in include/media/vsp1.h for the ISP driver to
> control the VSPX operations.
> 
> Signed-off-by: Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
> ---
> The VSPX is a VSP2 function that reads data from external memory using
> two RPF instances and feed it to the ISP.
> 
> The VSPX includes an IIF unit (ISP InterFace) modeled in the vsp1 driver
> as a new, simple, entity type.
> 
> IIF is part of VSPX, a version of the VSP2 IP specialized for ISP
> interfacing. To prepare to support VSPX, support IIF first by
> introducing a new entity and by adjusting the RPF/WPF drivers to
> operate correctly when an IIF is present.
> 
> Changes in v8:
> - Remove patches already collected by Laurent in
>   [GIT PULL FOR v6.16] Renesas media drivers changes
> 
> - Rebased on
>   https://gitlab.freedesktop.org/linux-media/users/pinchartl.git #renesas-next
> 
> - Changes to the VSPX interface towards the ISP
>   - Split start/stop_streaming
>   - Add vsp1_isp_jobs_release() to release pending jobs
>   - Add vsp1_isp_free_buffer()
>   - Remove vsp1_isp_configure() and compute partitions on job creation
> 
> - Driver changes
>   - Drop irq-driver flow
>     The VSPX used to schedule new jobs as soon as processing the last
>     one is done. This doesn't work well with the R-Car ISP design
>     for two reasons:
>     - The ISP needs per-job registers programming
>     - The ISP and VSPX job queues have to stay in sync
> 
> - Minors
>   - Remove the jobs_lock as a single lock is fine
>   - Protect against VSPX/ISP irq races in job_run() by checking the
>     VSPX 'busy' register and remove the 'processing' flag
>   - Manually set the pipeline state to STOPPED before scheduling a new
>     job without waiting for frame_end
> 
> Changes in v7:
> - Include VSPX driver in the series
> - Use existing VSP1 formats and remove patches extending formats on RPF
> - Rework VSPX driver to split jobs creation and scheduling in two
>   different API entry points
> - Fix VSPX stride using the user provided bytesperline and using the
>   buffer size for ConfigDMA buffers
> - Link to v6: https://lore.kernel.org/r/20250321-v4h-iif-v6-0-361e9043026a@ideasonboard.com
> 
> Changes in v6:
> - Little cosmetic change as suggested by Laurent
> - Collect tags
> - Link to v5: https://lore.kernel.org/r/20250319-v4h-iif-v5-0-0a10456d792c@ideasonboard.com
> 
> Changes in v5:
> - Drop additional empty line 5/6
> - Link to v4: https://lore.kernel.org/r/20250318-v4h-iif-v4-0-10ed4c41c195@ideasonboard.com
> 
> Changes in v4:
> - Fix SWAP bits for RAW10, RAW12 and RAW16
> - Link to v3: https://lore.kernel.org/r/20250317-v4h-iif-v3-0-63aab8982b50@ideasonboard.com
> 
> Changes in v3:
> - Drop 2/6 from v2
> - Add 5/7 to prepare for a new implementation of 6/7
> - Individual changelog per patch
> - Add 7/7
> - Link to v2: https://lore.kernel.org/r/20250224-v4h-iif-v2-0-0305e3c1fe2d@ideasonboard.com
> 
> Changes in v2:
> - Collect tags
> - Address review comments from Laurent, a lot of tiny changes here and
>   there but no major redesign worth an entry in the patchset changelog

You've come a long way since v1, I think we're getting close to a good
implementation.

> ---
>  drivers/media/platform/renesas/vsp1/Makefile    |   1 +
>  drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
>  drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
>  drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
>  drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
>  drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
>  include/media/vsp1.h                            |  75 +++
>  7 files changed, 770 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/renesas/vsp1/Makefile b/drivers/media/platform/renesas/vsp1/Makefile
> index de8c802e1d1a..2057c8f7be47 100644
> --- a/drivers/media/platform/renesas/vsp1/Makefile
> +++ b/drivers/media/platform/renesas/vsp1/Makefile
> @@ -6,5 +6,6 @@ vsp1-y					+= vsp1_clu.o vsp1_hsit.o vsp1_lut.o
>  vsp1-y					+= vsp1_brx.o vsp1_sru.o vsp1_uds.o
>  vsp1-y					+= vsp1_hgo.o vsp1_hgt.o vsp1_histo.o
>  vsp1-y					+= vsp1_iif.o vsp1_lif.o vsp1_uif.o
> +vsp1-y					+= vsp1_vspx.o
>  
>  obj-$(CONFIG_VIDEO_RENESAS_VSP1)	+= vsp1.o
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1.h b/drivers/media/platform/renesas/vsp1/vsp1.h
> index 263024639dd2..1872e403f26b 100644
> --- a/drivers/media/platform/renesas/vsp1/vsp1.h
> +++ b/drivers/media/platform/renesas/vsp1/vsp1.h
> @@ -110,6 +110,7 @@ struct vsp1_device {
>  	struct media_entity_operations media_ops;
>  
>  	struct vsp1_drm *drm;
> +	struct vsp1_vspx *vspx;
>  };
>  
>  int vsp1_device_get(struct vsp1_device *vsp1);
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_drv.c b/drivers/media/platform/renesas/vsp1/vsp1_drv.c
> index d13e9b31aa7c..e338ab8af292 100644
> --- a/drivers/media/platform/renesas/vsp1/vsp1_drv.c
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_drv.c
> @@ -38,6 +38,7 @@
>  #include "vsp1_uds.h"
>  #include "vsp1_uif.h"
>  #include "vsp1_video.h"
> +#include "vsp1_vspx.h"
>  
>  /* -----------------------------------------------------------------------------
>   * Interrupt Handling
> @@ -488,7 +489,10 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
>  
>  		ret = media_device_register(mdev);
>  	} else {
> -		ret = vsp1_drm_init(vsp1);
> +		if (vsp1->info->version == VI6_IP_VERSION_MODEL_VSPX_GEN4)
> +			ret = vsp1_vspx_init(vsp1);
> +		else
> +			ret = vsp1_drm_init(vsp1);
>  	}
>  
>  done:
> @@ -846,6 +850,13 @@ static const struct vsp1_device_info vsp1_device_infos[] = {
>  		.uif_count = 2,
>  		.wpf_count = 1,
>  		.num_bru_inputs = 5,
> +	}, {
> +		.version = VI6_IP_VERSION_MODEL_VSPX_GEN4,
> +		.model = "VSP2-X",
> +		.gen = 4,
> +		.features = VSP1_HAS_IIF,
> +		.rpf_count = 2,
> +		.wpf_count = 1,
>  	},
>  };
>  
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_regs.h b/drivers/media/platform/renesas/vsp1/vsp1_regs.h
> index 86e47c2d991f..10cfbcd1b6e0 100644
> --- a/drivers/media/platform/renesas/vsp1/vsp1_regs.h
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_regs.h
> @@ -799,6 +799,7 @@
>  #define VI6_IP_VERSION_MODEL_VSPDL_GEN3	(0x19 << 8)
>  #define VI6_IP_VERSION_MODEL_VSPBS_GEN3	(0x1a << 8)
>  #define VI6_IP_VERSION_MODEL_VSPD_GEN4	(0x1c << 8)
> +#define VI6_IP_VERSION_MODEL_VSPX_GEN4	(0x1d << 8)
>  /* RZ/G2L SoCs have no version register, So use 0x80 as the model version */
>  #define VI6_IP_VERSION_MODEL_VSPD_RZG2L	(0x80 << 8)
>  
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> new file mode 100644
> index 000000000000..6edb5e4929d9
> --- /dev/null
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> @@ -0,0 +1,664 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * vsp1_vspx.c  --  R-Car Gen 4 VSPX
> + *
> + * Copyright (C) 2025 Ideas On Board Oy
> + * Copyright (C) 2025 Renesas Electronics Corporation
> + */
> +
> +#include "vsp1_vspx.h"
> +
> +#include <linux/cleanup.h>
> +#include <linux/container_of.h>
> +#include <linux/delay.h>
> +#include <linux/device.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/list.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +#include <media/media-entity.h>
> +#include <media/v4l2-subdev.h>
> +#include <media/vsp1.h>
> +
> +#include "vsp1_dl.h"
> +#include "vsp1_iif.h"
> +#include "vsp1_pipe.h"
> +#include "vsp1_rwpf.h"
> +
> +/*
> + * struct vsp1_vspx_pipeline - VSPX pipeline
> + * @pipe: the VSP1 pipeline
> + * @partition: the pre-calculated partition used by the pipeline
> + * @vspx_lock: protect access to the VSPX configuration and the jobs queue

As there's a single lock, you could name it just "lock".

> + * @enabled: the enable flag
> + * @jobs: jobs queue
> + * @vspx_frame_end: frame end callback
> + * @frame_end_data: data for the frame end callback
> + */
> +struct vsp1_vspx_pipeline {
> +	struct vsp1_pipeline pipe;
> +	struct vsp1_partition partition;
> +
> +	/* Protects the pipeline configuration */
> +	spinlock_t vspx_lock;
> +	bool enabled;
> +
> +	struct list_head jobs;
> +
> +	void (*vspx_frame_end)(void *frame_end_data);
> +	void *frame_end_data;
> +};
> +
> +static inline struct vsp1_vspx_pipeline *
> +to_vsp1_vspx_pipeline(struct vsp1_pipeline *pipe)
> +{
> +	return container_of(pipe, struct vsp1_vspx_pipeline, pipe);
> +}
> +
> +/*
> + * struct vsp1_vspx_job - VSPX transfer job
> + * @dl: display list populated by vsp1_isp_job_prepare()
> + * @job_queue: list handle
> + */
> +struct vsp1_vspx_job {
> +	struct vsp1_dl_list *dl;
> +	struct list_head job_queue;
> +};
> +
> +/*
> + * struct vsp1_vspx - VSPX device
> + * @vsp1: the VSP1 device
> + * @pipe: the VSPX pipeline
> + */
> +struct vsp1_vspx {
> +	struct vsp1_device *vsp1;
> +	struct vsp1_vspx_pipeline pipe;
> +};
> +
> +static const struct v4l2_pix_format_mplane vspx_default_fmt = {
> +	.width = 1920,
> +	.height = 1080,
> +	.pixelformat = V4L2_PIX_FMT_SRGGB8,
> +	.field = V4L2_FIELD_NONE,
> +	.num_planes = 1,
> +	.plane_fmt = {
> +		[0] = {
> +			.sizeimage = 1920 * 1080,
> +			.bytesperline = 1920,
> +		},
> +	},
> +};
> +
> +/*
> + * Apply the given width, height and fourcc to the subdevice inside the
> + * VSP1 entity.

s/entity/RWPF/ as you only deal with RPFs and WPFs here. Or

/* Apply the given width, height and fourcc to the RWPF's subdevice */

> + */
> +static int vsp1_vspx_rwpf_set_subdev_fmt(struct vsp1_device *vsp1,
> +					 struct vsp1_rwpf *rwpf,
> +					 u32 isp_fourcc,
> +					 unsigned int width,
> +					 unsigned int height)
> +{
> +	struct vsp1_entity *ent = &rwpf->entity;
> +	const struct vsp1_format_info *fmtinfo;
> +	struct v4l2_subdev_format format = {};
> +	u32 vspx_fourcc;
> +
> +	switch (isp_fourcc) {
> +	case V4L2_PIX_FMT_GREY:
> +		/* 8 bit RAW Bayer image. */
> +		vspx_fourcc = V4L2_PIX_FMT_RGB332;
> +		break;
> +	case V4L2_PIX_FMT_Y10:
> +	case V4L2_PIX_FMT_Y12:
> +	case V4L2_PIX_FMT_Y16:
> +		/* 10, 12 and 16 bit RAW Bayer image. */
> +		vspx_fourcc = V4L2_PIX_FMT_RGB565;
> +		break;
> +	case V4L2_META_FMT_GENERIC_8:
> +		/* ConfigDMA parameters buffer. */
> +		vspx_fourcc = V4L2_PIX_FMT_XBGR32;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	fmtinfo = vsp1_get_format_info(vsp1, vspx_fourcc);
> +	rwpf->fmtinfo = fmtinfo;

	rwpf->fmtinfo = vsp1_get_format_info(vsp1, vspx_fourcc);

and drop the local variable.

> +
> +	format.which = V4L2_SUBDEV_FORMAT_ACTIVE;
> +	format.pad = RWPF_PAD_SINK;
> +	format.format.width = width;
> +	format.format.height = height;
> +	format.format.field = V4L2_FIELD_NONE;
> +	format.format.code = fmtinfo->mbus;
> +
> +	return v4l2_subdev_call(&ent->subdev, pad, set_fmt, NULL, &format);
> +}
> +
> +/* Configure RPF0 for ConfigDMA and RAW image transfer. */

s/and/or/

But you also configure the WPF, so I'd write

/* Configure the RPF -> IIF -> WPF pipeline for ConfigDMA or RAW image transfer. */

and rename the function to vsp1_vspx_pipeline_configure().

> +static int vsp1_vspx_rpf0_configure(struct vsp1_device *vsp1,
> +				    dma_addr_t addr, u32 isp_fourcc,
> +				    unsigned int width, unsigned int height,
> +				    unsigned int stride,
> +				    unsigned int iif_sink_pad,
> +				    struct vsp1_dl_list *dl,
> +				    struct vsp1_dl_body *dlb)
> +{
> +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> +	struct vsp1_rwpf *rpf0 = pipe->inputs[0];
> +	int ret;
> +
> +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, rpf0, isp_fourcc, width,
> +					    height);
> +	if (ret)
> +		return ret;
> +
> +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output, isp_fourcc,
> +					    width, height);
> +	if (ret)
> +		return ret;
> +
> +	vsp1_pipeline_calculate_partition(pipe, &pipe->part_table[0],
> +					  width, 0);
> +

You should also set rwpf->format.num_planes to 1 here.

> +	rpf0->format.plane_fmt[0].bytesperline = stride;
> +
> +	/*
> +	 * Connect RPF0 to the IIF sink pad corresponding to the config or image
> +	 * path.
> +	 */
> +	rpf0->entity.sink_pad = iif_sink_pad;
> +
> +	pipe->part_table[0].rpf[0].width = width;
> +	pipe->part_table[0].rpf[0].height = height;

Isn't this handled by vsp1_pipeline_calculate_partition() ?

> +
> +	rpf0->mem.addr[0] = addr;
> +	rpf0->mem.addr[1] = 0;
> +	rpf0->mem.addr[2] = 0;
> +
> +	vsp1_entity_route_setup(&rpf0->entity, pipe, dlb);
> +	vsp1_entity_configure_stream(&rpf0->entity, rpf0->entity.state, pipe,
> +				     dl, dlb);
> +	vsp1_entity_configure_partition(&rpf0->entity, pipe,
> +					&pipe->part_table[0], dl, dlb);
> +
> +	return 0;
> +}
> +
> +/* -----------------------------------------------------------------------------
> + * Interrupt handling
> + */
> +
> +static void vsp1_vspx_pipeline_frame_end(struct vsp1_pipeline *pipe,
> +					 unsigned int completion)
> +{
> +	struct vsp1_vspx_pipeline *vspx_pipe = to_vsp1_vspx_pipeline(pipe);
> +
> +	if (vspx_pipe->vspx_frame_end)
> +		vspx_pipe->vspx_frame_end(vspx_pipe->frame_end_data);
> +}
> +
> +/* -----------------------------------------------------------------------------
> + * ISP Driver API (include/media/vsp1.h)
> + */
> +
> +/**
> + * vsp1_isp_init() - Initialize the VSPX
> + *
> + * @dev: The VSP1 struct device
> + *
> + * Return: %0 on success or a negative error code on failure
> + */
> +int vsp1_isp_init(struct device *dev)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +
> +	if (!vsp1)
> +		return -EPROBE_DEFER;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_init);
> +
> +/**
> + * vsp1_isp_get_bus_master - Get VSPX bus master
> + *
> + * The VSPX accesses memory through an FCPX instance. When allocating memory
> + * buffers that will have to be accessed by the VSPX the 'struct device' of
> + * the FCPX should be used. Use this function to get a reference to it.

The function description should go after parameters. Same elsewhere
where applicable.

> + *
> + * @dev: The VSP1 struct device
> + *
> + * Return: a pointer to the bus master's device
> + */
> +struct device *vsp1_isp_get_bus_master(struct device *dev)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +
> +	if (!vsp1)
> +		return ERR_PTR(-ENODEV);
> +
> +	return vsp1->bus_master;
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_get_bus_master);
> +
> +/**
> + * vsp1_isp_alloc_buffer - Allocate a buffer in the VSPX address space
> + *
> + * Allocate a buffer that will be later accessed by the VSPX. Buffers allocated
> + * using vsp1_isp_alloc_buffer() shall be released with a call to
> + * vsp1_isp_free_buffer().

I would explain somewhere that this is meant for ISP configuration
parameters.

> + *
> + * @dev: The VSP1 struct device
> + * @size: The size of the buffer to be allocated by the VSPX
> + * @buffer_desc: The allocated buffer descriptor, will be filled with the
> + *		 buffer CPU-mapped address, the bus address and the allocated
> + *		 buffer size
> + *
> + * Return: %0 on success or a negative error code on failure
> + */
> +int vsp1_isp_alloc_buffer(struct device *dev, size_t size,
> +			  struct vsp1_isp_buffer_desc *buffer_desc)
> +{
> +	struct device *bus_master = vsp1_isp_get_bus_master(dev);
> +
> +	if (IS_ERR_OR_NULL(bus_master))
> +		return -ENODEV;
> +
> +	buffer_desc->cpu_addr = dma_alloc_coherent(bus_master, size,
> +						   &buffer_desc->dma_addr,
> +						   GFP_KERNEL);
> +	if (!buffer_desc->cpu_addr)
> +		return -ENOMEM;
> +
> +	buffer_desc->size = size;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_alloc_buffer);
> +
> +/**
> + * vsp1_isp_free_buffer - Release a buffer allocated by vsp1_isp_alloc_buffer()
> + *
> + * Release memory in the VSPX address space allocated by
> + * vsp1_isp_alloc_buffer().
> + *
> + * @dev: The VSP1 struct device
> + * @buffer_desc: The descriptor of the buffer to release as returned by
> + *		 vsp1_isp_alloc_buffer()
> + */
> +void vsp1_isp_free_buffer(struct device *dev,
> +			  struct vsp1_isp_buffer_desc *buffer_desc)
> +{
> +	struct device *bus_master = vsp1_isp_get_bus_master(dev);
> +
> +	if (IS_ERR_OR_NULL(bus_master))
> +		return;
> +
> +	dma_free_coherent(bus_master, buffer_desc->size, buffer_desc->cpu_addr,
> +			  buffer_desc->dma_addr);
> +}
> +
> +/**
> + * vsp1_isp_start_streaming - Start processing VSPX jobs
> + *
> + * Start the VSPX and prepare for accepting buffer transfer job requests.
> + *
> + * @dev: The VSP1 struct device
> + * @frame_end: The frame end callback description
> + *
> + * Return: %0 on success or a negative error code on failure
> + */
> +int vsp1_isp_start_streaming(struct device *dev,
> +			     struct vsp1_vspx_frame_end *frame_end)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> +	unsigned long flags;
> +	u32 value;
> +	int ret;
> +
> +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);

Can this function ever be called with interrupts disabled ? If no, you
can use spin_lock_irq().

> +
> +	if (vspx_pipe->enabled) {
> +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> +		return 0;

Shouldn't this be an error ?

> +	}
> +
> +	if (!frame_end) {

You can move this check before taking the lock.

> +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> +		return -EINVAL;
> +	}
> +
> +	vspx_pipe->vspx_frame_end = frame_end->vspx_frame_end;
> +	vspx_pipe->frame_end_data = frame_end->frame_end_data;

Move this just before setting ->enabled to true, so you won't override
the values if the function returns an error in the checks below.

> +
> +	/* Make sure VSPX is not active. */

This should never happen unless there's a serious bug in the driver,
right ? I would make that explicit in the comment:

	/*
	 * Make sure VSPX is not active. This should never happen in normal
	 * usage.
	 */

> +	value = vsp1_read(vsp1, VI6_CMD(0));
> +	if (value & VI6_CMD_STRCMD) {
> +		dev_err(vsp1->dev,
> +			"%s: Starting of WPF0 already reserved\n", __func__);
> +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> +		return -EBUSY;
> +	}
> +
> +	value = vsp1_read(vsp1, VI6_STATUS);
> +	if (value & VI6_STATUS_SYS_ACT(0)) {
> +		dev_err(vsp1->dev,
> +			"%s: WPF0 has not entered idle state\n", __func__);
> +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> +		return -EBUSY;
> +	}
> +
> +	vspx_pipe->enabled = true;
> +
> +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);

and close the guarded scope here

	}

> +
> +	/* Enable the VSP1 and prepare for streaming. */
> +	vsp1_pipeline_dump(pipe, "VSPX job");
> +
> +	ret = vsp1_device_get(vsp1);

This should be done before you read the registers.

> +	if (ret < 0) {
> +		guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
> +		vspx_pipe->enabled = false;
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_start_streaming);
> +
> +/**
> + * vsp1_isp_stop_streaming - Stop the VSPX
> + *
> + * @dev: The VSP1 struct device
> + */
> +void vsp1_isp_stop_streaming(struct device *dev)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> +
> +	if (!vspx_pipe->enabled) {
> +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> +		return;
> +	}

	scoped_guard(spinlock_irqsave)(&vspx_pipe->vspx_lock, flags) {
		if (!vspx_pipe->enabled)
			return;

		...

(or spinlock_irq, see comment in vsp1_isp_start_streaming()).

> +
> +	vspx_pipe->enabled = false;
> +
> +	pipe->state = VSP1_PIPELINE_STOPPED;
> +	vsp1_pipeline_stop(pipe);

You can't call this with a spin lock held. We can look together at how
to handle locking in this driver, and use a mutex instead of a spinlock
(I know you went the other way around).

> +	vspx_pipe->vspx_frame_end = NULL;
> +	vsp1_dlm_reset(pipe->output->dlm);
> +
> +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);

	}

> +
> +	vsp1_device_put(vsp1);
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_stop_streaming);
> +
> +/**
> + * vsp1_vspx_job_prepare - Prepare a new buffer transfer job
> + *
> + * Prepare a new buffer transfer job by populating a display list that will be
> + * later executed by a call to vsp1_isp_job_run().
> + *
> + * The newly created job is appended to the queue of pending jobs. All pending
> + * jobs must be released after stopping the streaming operations with a call
> + * to vsp1_isp_jobs_release().
> + *
> + * @dev: The VSP1 struct device
> + * @job: The job description
> + *
> + * Return: %0 on success or a negative error code on failure
> + */
> +int vsp1_isp_job_prepare(struct device *dev,
> +			 const struct vsp1_isp_job_desc *desc)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> +	const struct v4l2_pix_format_mplane *pix_mp;
> +	struct vsp1_dl_list *second_dl = NULL;
> +	struct vsp1_vspx_job *job;
> +	struct vsp1_dl_body *dlb;
> +	struct vsp1_dl_list *dl;
> +	int ret;
> +
> +	/* Memory will be released when the job is consumed. */
> +	job = kmalloc(sizeof(*job), GFP_KERNEL);
> +	if (!job)
> +		return -ENOMEM;

Hmmm... I think this can be improved. I would need to see how the ISP
driver uses this API, but I'm thinking the job data could be stored in
vsp1_isp_job_desc, without adding the job to a queue in this function.
The job would then be passed to the run function, which would queue it.
I think the resulting API would be cleaner. You will need an extra
function to delete/destroy/cleanup a job that hasn't been queued (the
counterpart of this function, essentially).

> +
> +	/*
> +	 * Transfer the buffers described in the job: an optional ConfigDMA
> +	 * parameters buffer and a RAW image.
> +	 */
> +
> +	job->dl = vsp1_dl_list_get(pipe->output->dlm);
> +	if (!job->dl) {
> +		ret = -ENOMEM;
> +		goto error_free_job;
> +	}
> +
> +	dl = job->dl;
> +	dlb = vsp1_dl_list_get_body0(dl);
> +
> +	/* Disable RPF1. */
> +	vsp1_dl_body_write(dlb, vsp1->rpf[1]->entity.route->reg,
> +			   VI6_DPR_NODE_UNUSED);

The route is disabled in vsp1_device_init(), and we never use RPF1. I
think this can be dropped.

> +
> +	/* Configure IIF routing and enable IIF function */

s/function/function./

> +	vsp1_entity_route_setup(pipe->iif, pipe, dlb);
> +	vsp1_entity_configure_stream(pipe->iif, pipe->iif->state, pipe,
> +				     dl, dlb);
> +
> +	/* Configure WPF0 to enable RPF0 as source*/

s/source/source. /

> +	vsp1_entity_route_setup(&pipe->output->entity, pipe, dlb);
> +	vsp1_entity_configure_stream(&pipe->output->entity,
> +				     pipe->output->entity.state, pipe,
> +				     dl, dlb);

I'm kind of tempting to call route_setup, configure_stream,
configure_frame and configure_partition by iterating over the entities
in the pipe, as done by vsp1_du_pipeline_configure(). The code would be
more generic, at the cost of a few calls being duplicated between the
config DMA and image configurations. What do you think ?

> +
> +	if (desc->config.pairs) {
> +		/*
> +		 * Configure RPF0 for config data. Transfer the number of
> +		 * configuration pairs plus 2 words for the header.
> +		 */
> +		ret = vsp1_vspx_rpf0_configure(vsp1, desc->config.mem,
> +					       V4L2_META_FMT_GENERIC_8,
> +					       desc->config.pairs * 2 + 2, 1,
> +					       desc->config.pairs * 2 + 2,
> +					       VSPX_IIF_SINK_PAD_CONFIG,
> +					       dl, dlb);
> +		if (ret)
> +			goto error_put_dl;
> +
> +		second_dl = vsp1_dl_list_get(pipe->output->dlm);
> +		if (!second_dl) {
> +			ret = -ENOMEM;
> +			goto error_put_dl;
> +		}
> +
> +		dl = second_dl;
> +		dlb = vsp1_dl_list_get_body0(dl);
> +	}
> +
> +	/* Configure RPF0 for RAW image transfer. */
> +	pix_mp = &desc->img.fmt.fmt.pix_mp;
> +	ret = vsp1_vspx_rpf0_configure(vsp1, desc->img.mem,
> +				       pix_mp->pixelformat,
> +				       pix_mp->width, pix_mp->height,
> +				       pix_mp->plane_fmt[0].bytesperline,
> +				       VSPX_IIF_SINK_PAD_IMG, dl, dlb);
> +	if (ret)
> +		goto error_put_dl;
> +
> +	if (second_dl)
> +		vsp1_dl_list_add_chain(job->dl, second_dl);
> +
> +	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
> +		list_add_tail(&job->job_queue, &vspx_pipe->jobs);
> +	}
> +
> +	return 0;
> +
> +error_put_dl:
> +	if (second_dl)
> +		vsp1_dl_list_put(second_dl);
> +	vsp1_dl_list_put(job->dl);
> +error_free_job:
> +	kfree(job);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_job_prepare);
> +
> +/**
> + * vsp1_isp_job_run - Run the first pending transfer job
> + *
> + * Extract the first available job from the pending job queue and execute the
> + * display list there contained. The ISP driver shall call this function when a
> + * new transfer of a ConfigDMA parameters buffer and RAW image is required.
> + *
> + * @dev: The VSP1 struct device
> + *
> + * Return: %0 on success or a negative error code on failure
> + */
> +int vsp1_isp_job_run(struct device *dev)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> +	struct vsp1_vspx_job *job;
> +	u32 value;
> +
> +	/* Make sure VSPX is not busy processing a frame. */
> +	value = vsp1_read(vsp1, VI6_CMD(0));
> +	if (value) {
> +		dev_err(vsp1->dev,
> +			"%s: Starting of WPF0 already reserved\n", __func__);
> +		return -EBUSY;
> +	}
> +
> +	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
> +
> +		job = list_first_entry_or_null(&vspx_pipe->jobs,
> +					       struct vsp1_vspx_job,
> +					       job_queue);
> +		if (!job)
> +			return -ENODEV;
> +
> +		list_del(&job->job_queue);
> +
> +		/* Apply the display list. */
> +		vsp1_dl_list_commit(job->dl, 0);
> +		kfree(job);
> +	}
> +
> +	scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> +		/*
> +		 * Operating the vsp1_pipe in singleshot mode requires to
> +		 * manually set the pipeline state to stopped.
> +		 */
> +		pipe->state = VSP1_PIPELINE_STOPPED;
> +		vsp1_pipeline_run(pipe);
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_job_run);
> +
> +/**
> + * vsp1_isp_jobs_release - Release all pending VSPX jobs
> + *
> + * Release jobs prepared by a call to vsp1_vspx_job_prepare() and not yet
> + * processed. Pending jobs shall be released after streaming has been stopped.
> + *
> + * @dev: The VSP1 struct device
> + */
> +void vsp1_isp_jobs_release(struct device *dev)
> +{
> +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> +	struct vsp1_vspx_job *job, *tmp;
> +
> +	guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
> +
> +	list_for_each_entry_safe(job, tmp, &vspx_pipe->jobs, job_queue) {
> +		list_del(&job->job_queue);
> +		vsp1_dl_list_put(job->dl);
> +		kfree(job);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(vsp1_isp_jobs_release);
> +
> +/* -----------------------------------------------------------------------------
> + * Initialization and cleanup
> + */
> +
> +int vsp1_vspx_init(struct vsp1_device *vsp1)
> +{
> +	struct vsp1_vspx_pipeline *vspx_pipe;
> +	struct vsp1_pipeline *pipe;
> +
> +	vsp1->vspx = devm_kzalloc(vsp1->dev, sizeof(*vsp1->vspx), GFP_KERNEL);
> +	if (!vsp1->vspx)
> +		return -ENOMEM;
> +
> +	vsp1->vspx->vsp1 = vsp1;
> +
> +	vspx_pipe = &vsp1->vspx->pipe;
> +	vspx_pipe->enabled = false;
> +
> +	pipe = &vspx_pipe->pipe;
> +
> +	vsp1_pipeline_init(pipe);
> +
> +	pipe->partitions = 1;
> +	pipe->part_table = &vspx_pipe->partition;
> +	pipe->interlaced = false;
> +	pipe->frame_end = vsp1_vspx_pipeline_frame_end;
> +
> +	INIT_LIST_HEAD(&vspx_pipe->jobs);
> +	spin_lock_init(&vspx_pipe->vspx_lock);
> +
> +	/*
> +	 * Initialize RPF0 as input for VSPX and use it unconditionally for
> +	 * now.
> +	 */
> +	pipe->inputs[0] = vsp1->rpf[0];
> +	pipe->inputs[0]->entity.pipe = pipe;
> +	pipe->inputs[0]->entity.sink = &vsp1->iif->entity;
> +	vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->inputs[0],
> +				      vspx_default_fmt.pixelformat,
> +				      vspx_default_fmt.width,
> +				      vspx_default_fmt.height);

Is there an actual need for this default configuration, given that you
always reprogram the RPF and WPF when preparing a job ? I think you
could just drop the call, and the vspx_default_fmt structure.

> +	list_add(&pipe->inputs[0]->entity.list_pipe, &pipe->entities);

s/list_add/list_add_tail/

It won't make any functional difference, but will align the code with
the calls below.

> +
> +	pipe->iif = &vsp1->iif->entity;
> +	pipe->iif->pipe = pipe;
> +	pipe->iif->sink = &vsp1->wpf[0]->entity;
> +	pipe->iif->sink_pad = RWPF_PAD_SINK;
> +	list_add_tail(&pipe->iif->list_pipe, &pipe->entities);
> +
> +	pipe->output = vsp1->wpf[0];
> +	pipe->output->entity.pipe = pipe;
> +	vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output,
> +				      vspx_default_fmt.pixelformat,
> +				      vspx_default_fmt.width,
> +				      vspx_default_fmt.height);

Same here.

> +	list_add_tail(&pipe->output->entity.list_pipe, &pipe->entities);
> +
> +	return 0;
> +}
> +
> +void vsp1_vspx_cleanup(struct vsp1_device *vsp1)
> +{
> +	vsp1_isp_jobs_release(vsp1->dev);
> +}
> diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.h b/drivers/media/platform/renesas/vsp1/vsp1_vspx.h
> new file mode 100644
> index 000000000000..f871bf9e7dec
> --- /dev/null
> +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +/*
> + * vsp1_vspx.h  --  R-Car Gen 4 VSPX
> + *
> + * Copyright (C) 2025 Ideas On Board Oy
> + * Copyright (C) 2025 Renesas Electronics Corporation
> + */
> +#ifndef __VSP1_VSPX_H__
> +#define __VSP1_VSPX_H__
> +
> +#include "vsp1.h"
> +
> +int vsp1_vspx_init(struct vsp1_device *vsp1);
> +void vsp1_vspx_cleanup(struct vsp1_device *vsp1);
> +
> +#endif /* __VSP1_VSPX_H__ */
> diff --git a/include/media/vsp1.h b/include/media/vsp1.h
> index 48f4a5023d81..6843e8e85520 100644
> --- a/include/media/vsp1.h
> +++ b/include/media/vsp1.h
> @@ -15,6 +15,10 @@
>  
>  struct device;
>  
> +/* -----------------------------------------------------------------------------
> + * VSP1 DU interface
> + */
> +
>  int vsp1_du_init(struct device *dev);
>  
>  #define VSP1_DU_STATUS_COMPLETE		BIT(0)
> @@ -117,4 +121,75 @@ void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index,
>  int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt);
>  void vsp1_du_unmap_sg(struct device *dev, struct sg_table *sgt);
>  
> +/* -----------------------------------------------------------------------------
> + * VSP1 ISP interface
> + */
> +
> +/**
> + * struct vsp1_isp_buffer_desc - Describe a buffer allocated by VSPX
> + *
> + * @size: Byte size of the buffer allocated by VSPX
> + * @cpu_addr: CPU-mapped address of a buffer allocated by VSPX
> + * @dma_addr: bus address of a buffer allocated by VSPX
> + */
> +struct vsp1_isp_buffer_desc {
> +	size_t size;
> +	void *cpu_addr;
> +	dma_addr_t dma_addr;
> +};
> +
> +/**
> + * struct vsp1_isp_job_desc - Describe a VSPX buffer transfer request
> + *
> + * Describe a transfer request for the VSPX to perform on behalf of the ISP.
> + * The transfer job descriptor contains an optional ConfigDMA buffer and
> + * one RAW image buffer. Set config.pairs to 0 if no ConfigDMA buffer should
> + * be transferred.
> + *
> + * @config: ConfigDMA buffer
> + * @config.pairs: number of reg-value pairs in the ConfigDMA buffer
> + * @config.mem: bus address of the ConfigDMA buffer
> + * @img: RAW image buffer
> + * @img.fmt: RAW image format
> + * @img.mem: bus address of the RAW image buffer
> + */
> +struct vsp1_isp_job_desc {
> +	struct {
> +		unsigned int pairs;
> +		dma_addr_t mem;
> +	} config;
> +	struct {
> +		struct v4l2_format fmt;
> +		dma_addr_t mem;
> +	} img;
> +};
> +
> +/**
> + * struct vsp1_vspx_frame_end - VSPX frame end callback data
> + *
> + * @vspx_frame_end: Frame end callback. Called after a transfer job has been
> + *		    completed. If the job includes both a ConfigDMA and a
> + *		    RAW image, the callback is called after both have been
> + *		    transferred
> + * @frame_end_data: Frame end callback data, passed to vspx_frame_end
> + */
> +struct vsp1_vspx_frame_end {
> +	void (*vspx_frame_end)(void *data);
> +	void *frame_end_data;
> +};
> +
> +int vsp1_isp_init(struct device *dev);
> +struct device *vsp1_isp_get_bus_master(struct device *dev);
> +int vsp1_isp_alloc_buffer(struct device *dev, size_t size,
> +			  struct vsp1_isp_buffer_desc *buffer_desc);
> +void vsp1_isp_free_buffer(struct device *dev,
> +			  struct vsp1_isp_buffer_desc *buffer_desc);
> +int vsp1_isp_start_streaming(struct device *dev,
> +			     struct vsp1_vspx_frame_end *frame_end);
> +void vsp1_isp_stop_streaming(struct device *dev);
> +int vsp1_isp_job_prepare(struct device *dev,
> +			 const struct vsp1_isp_job_desc *desc);
> +int vsp1_isp_job_run(struct device *dev);
> +void vsp1_isp_jobs_release(struct device *dev);
> +
>  #endif /* __MEDIA_VSP1_H__ */
> 
> ---
> base-commit: 1d41f477d6ff5f5eb0b78b37644ffac0785602c9
> change-id: 20250502-b4-vspx-90c815bff6dd

-- 
Regards,

Laurent Pinchart
Re: [PATCH v8] media: vsp1: Add VSPX support
Posted by Jacopo Mondi 9 months, 1 week ago
Hi Laurent

On Fri, May 02, 2025 at 11:26:44PM +0300, Laurent Pinchart wrote:
> Hi Jacopo,
>
> Thank you for the patch.
>
> On Fri, May 02, 2025 at 03:23:10PM +0200, Jacopo Mondi wrote:
> > Add support for VSPX, a specialized version of the VSP2 that
> > transfers data to the ISP. The VSPX is composed of two RPF units
> > to read data from external memory and an IIF instance that performs
> > transfer towards the ISP.
> >
> > The VSPX is supported through a newly introduced vsp1_vspx.c file that
> > exposes two interfaces: vsp1_vspx interface, declared in vsp1_vspx.h
> > for the vsp1 core to initialize and cleanup the VSPX, and a vsp1_isp
> > interface, declared in include/media/vsp1.h for the ISP driver to
> > control the VSPX operations.
> >
> > Signed-off-by: Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
> > ---
> > The VSPX is a VSP2 function that reads data from external memory using
> > two RPF instances and feed it to the ISP.
> >
> > The VSPX includes an IIF unit (ISP InterFace) modeled in the vsp1 driver
> > as a new, simple, entity type.
> >
> > IIF is part of VSPX, a version of the VSP2 IP specialized for ISP
> > interfacing. To prepare to support VSPX, support IIF first by
> > introducing a new entity and by adjusting the RPF/WPF drivers to
> > operate correctly when an IIF is present.
> >
> > Changes in v8:
> > - Remove patches already collected by Laurent in
> >   [GIT PULL FOR v6.16] Renesas media drivers changes
> >
> > - Rebased on
> >   https://gitlab.freedesktop.org/linux-media/users/pinchartl.git #renesas-next
> >
> > - Changes to the VSPX interface towards the ISP
> >   - Split start/stop_streaming
> >   - Add vsp1_isp_jobs_release() to release pending jobs
> >   - Add vsp1_isp_free_buffer()
> >   - Remove vsp1_isp_configure() and compute partitions on job creation
> >
> > - Driver changes
> >   - Drop irq-driver flow
> >     The VSPX used to schedule new jobs as soon as processing the last
> >     one is done. This doesn't work well with the R-Car ISP design
> >     for two reasons:
> >     - The ISP needs per-job registers programming
> >     - The ISP and VSPX job queues have to stay in sync
> >
> > - Minors
> >   - Remove the jobs_lock as a single lock is fine
> >   - Protect against VSPX/ISP irq races in job_run() by checking the
> >     VSPX 'busy' register and remove the 'processing' flag
> >   - Manually set the pipeline state to STOPPED before scheduling a new
> >     job without waiting for frame_end
> >
> > Changes in v7:
> > - Include VSPX driver in the series
> > - Use existing VSP1 formats and remove patches extending formats on RPF
> > - Rework VSPX driver to split jobs creation and scheduling in two
> >   different API entry points
> > - Fix VSPX stride using the user provided bytesperline and using the
> >   buffer size for ConfigDMA buffers
> > - Link to v6: https://lore.kernel.org/r/20250321-v4h-iif-v6-0-361e9043026a@ideasonboard.com
> >
> > Changes in v6:
> > - Little cosmetic change as suggested by Laurent
> > - Collect tags
> > - Link to v5: https://lore.kernel.org/r/20250319-v4h-iif-v5-0-0a10456d792c@ideasonboard.com
> >
> > Changes in v5:
> > - Drop additional empty line 5/6
> > - Link to v4: https://lore.kernel.org/r/20250318-v4h-iif-v4-0-10ed4c41c195@ideasonboard.com
> >
> > Changes in v4:
> > - Fix SWAP bits for RAW10, RAW12 and RAW16
> > - Link to v3: https://lore.kernel.org/r/20250317-v4h-iif-v3-0-63aab8982b50@ideasonboard.com
> >
> > Changes in v3:
> > - Drop 2/6 from v2
> > - Add 5/7 to prepare for a new implementation of 6/7
> > - Individual changelog per patch
> > - Add 7/7
> > - Link to v2: https://lore.kernel.org/r/20250224-v4h-iif-v2-0-0305e3c1fe2d@ideasonboard.com
> >
> > Changes in v2:
> > - Collect tags
> > - Address review comments from Laurent, a lot of tiny changes here and
> >   there but no major redesign worth an entry in the patchset changelog
>
> You've come a long way since v1, I think we're getting close to a good
> implementation.
>

Thank you and Niklas for reviews and testing

> > ---
> >  drivers/media/platform/renesas/vsp1/Makefile    |   1 +
> >  drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
> >  drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
> >  drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
> >  drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
> >  drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
> >  include/media/vsp1.h                            |  75 +++
> >  7 files changed, 770 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/media/platform/renesas/vsp1/Makefile b/drivers/media/platform/renesas/vsp1/Makefile
> > index de8c802e1d1a..2057c8f7be47 100644
> > --- a/drivers/media/platform/renesas/vsp1/Makefile
> > +++ b/drivers/media/platform/renesas/vsp1/Makefile
> > @@ -6,5 +6,6 @@ vsp1-y					+= vsp1_clu.o vsp1_hsit.o vsp1_lut.o
> >  vsp1-y					+= vsp1_brx.o vsp1_sru.o vsp1_uds.o
> >  vsp1-y					+= vsp1_hgo.o vsp1_hgt.o vsp1_histo.o
> >  vsp1-y					+= vsp1_iif.o vsp1_lif.o vsp1_uif.o
> > +vsp1-y					+= vsp1_vspx.o
> >
> >  obj-$(CONFIG_VIDEO_RENESAS_VSP1)	+= vsp1.o
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1.h b/drivers/media/platform/renesas/vsp1/vsp1.h
> > index 263024639dd2..1872e403f26b 100644
> > --- a/drivers/media/platform/renesas/vsp1/vsp1.h
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1.h
> > @@ -110,6 +110,7 @@ struct vsp1_device {
> >  	struct media_entity_operations media_ops;
> >
> >  	struct vsp1_drm *drm;
> > +	struct vsp1_vspx *vspx;
> >  };
> >
> >  int vsp1_device_get(struct vsp1_device *vsp1);
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_drv.c b/drivers/media/platform/renesas/vsp1/vsp1_drv.c
> > index d13e9b31aa7c..e338ab8af292 100644
> > --- a/drivers/media/platform/renesas/vsp1/vsp1_drv.c
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_drv.c
> > @@ -38,6 +38,7 @@
> >  #include "vsp1_uds.h"
> >  #include "vsp1_uif.h"
> >  #include "vsp1_video.h"
> > +#include "vsp1_vspx.h"
> >
> >  /* -----------------------------------------------------------------------------
> >   * Interrupt Handling
> > @@ -488,7 +489,10 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
> >
> >  		ret = media_device_register(mdev);
> >  	} else {
> > -		ret = vsp1_drm_init(vsp1);
> > +		if (vsp1->info->version == VI6_IP_VERSION_MODEL_VSPX_GEN4)
> > +			ret = vsp1_vspx_init(vsp1);
> > +		else
> > +			ret = vsp1_drm_init(vsp1);
> >  	}
> >
> >  done:
> > @@ -846,6 +850,13 @@ static const struct vsp1_device_info vsp1_device_infos[] = {
> >  		.uif_count = 2,
> >  		.wpf_count = 1,
> >  		.num_bru_inputs = 5,
> > +	}, {
> > +		.version = VI6_IP_VERSION_MODEL_VSPX_GEN4,
> > +		.model = "VSP2-X",
> > +		.gen = 4,
> > +		.features = VSP1_HAS_IIF,
> > +		.rpf_count = 2,
> > +		.wpf_count = 1,
> >  	},
> >  };
> >
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_regs.h b/drivers/media/platform/renesas/vsp1/vsp1_regs.h
> > index 86e47c2d991f..10cfbcd1b6e0 100644
> > --- a/drivers/media/platform/renesas/vsp1/vsp1_regs.h
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_regs.h
> > @@ -799,6 +799,7 @@
> >  #define VI6_IP_VERSION_MODEL_VSPDL_GEN3	(0x19 << 8)
> >  #define VI6_IP_VERSION_MODEL_VSPBS_GEN3	(0x1a << 8)
> >  #define VI6_IP_VERSION_MODEL_VSPD_GEN4	(0x1c << 8)
> > +#define VI6_IP_VERSION_MODEL_VSPX_GEN4	(0x1d << 8)
> >  /* RZ/G2L SoCs have no version register, So use 0x80 as the model version */
> >  #define VI6_IP_VERSION_MODEL_VSPD_RZG2L	(0x80 << 8)
> >
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > new file mode 100644
> > index 000000000000..6edb5e4929d9
> > --- /dev/null
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > @@ -0,0 +1,664 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +/*
> > + * vsp1_vspx.c  --  R-Car Gen 4 VSPX
> > + *
> > + * Copyright (C) 2025 Ideas On Board Oy
> > + * Copyright (C) 2025 Renesas Electronics Corporation
> > + */
> > +
> > +#include "vsp1_vspx.h"
> > +
> > +#include <linux/cleanup.h>
> > +#include <linux/container_of.h>
> > +#include <linux/delay.h>
> > +#include <linux/device.h>
> > +#include <linux/dma-mapping.h>
> > +#include <linux/list.h>
> > +#include <linux/slab.h>
> > +#include <linux/spinlock.h>
> > +
> > +#include <media/media-entity.h>
> > +#include <media/v4l2-subdev.h>
> > +#include <media/vsp1.h>
> > +
> > +#include "vsp1_dl.h"
> > +#include "vsp1_iif.h"
> > +#include "vsp1_pipe.h"
> > +#include "vsp1_rwpf.h"
> > +
> > +/*
> > + * struct vsp1_vspx_pipeline - VSPX pipeline
> > + * @pipe: the VSP1 pipeline
> > + * @partition: the pre-calculated partition used by the pipeline
> > + * @vspx_lock: protect access to the VSPX configuration and the jobs queue
>
> As there's a single lock, you could name it just "lock".
>

it can indeed

> > + * @enabled: the enable flag
> > + * @jobs: jobs queue
> > + * @vspx_frame_end: frame end callback
> > + * @frame_end_data: data for the frame end callback
> > + */
> > +struct vsp1_vspx_pipeline {
> > +	struct vsp1_pipeline pipe;
> > +	struct vsp1_partition partition;
> > +
> > +	/* Protects the pipeline configuration */
> > +	spinlock_t vspx_lock;
> > +	bool enabled;
> > +
> > +	struct list_head jobs;
> > +
> > +	void (*vspx_frame_end)(void *frame_end_data);
> > +	void *frame_end_data;
> > +};
> > +
> > +static inline struct vsp1_vspx_pipeline *
> > +to_vsp1_vspx_pipeline(struct vsp1_pipeline *pipe)
> > +{
> > +	return container_of(pipe, struct vsp1_vspx_pipeline, pipe);
> > +}
> > +
> > +/*
> > + * struct vsp1_vspx_job - VSPX transfer job
> > + * @dl: display list populated by vsp1_isp_job_prepare()
> > + * @job_queue: list handle
> > + */
> > +struct vsp1_vspx_job {
> > +	struct vsp1_dl_list *dl;
> > +	struct list_head job_queue;
> > +};
> > +
> > +/*
> > + * struct vsp1_vspx - VSPX device
> > + * @vsp1: the VSP1 device
> > + * @pipe: the VSPX pipeline
> > + */
> > +struct vsp1_vspx {
> > +	struct vsp1_device *vsp1;
> > +	struct vsp1_vspx_pipeline pipe;
> > +};
> > +
> > +static const struct v4l2_pix_format_mplane vspx_default_fmt = {
> > +	.width = 1920,
> > +	.height = 1080,
> > +	.pixelformat = V4L2_PIX_FMT_SRGGB8,
> > +	.field = V4L2_FIELD_NONE,
> > +	.num_planes = 1,
> > +	.plane_fmt = {
> > +		[0] = {
> > +			.sizeimage = 1920 * 1080,
> > +			.bytesperline = 1920,
> > +		},
> > +	},
> > +};
> > +
> > +/*
> > + * Apply the given width, height and fourcc to the subdevice inside the
> > + * VSP1 entity.
>
> s/entity/RWPF/ as you only deal with RPFs and WPFs here. Or
>
> /* Apply the given width, height and fourcc to the RWPF's subdevice */
>

ack

> > + */
> > +static int vsp1_vspx_rwpf_set_subdev_fmt(struct vsp1_device *vsp1,
> > +					 struct vsp1_rwpf *rwpf,
> > +					 u32 isp_fourcc,
> > +					 unsigned int width,
> > +					 unsigned int height)
> > +{
> > +	struct vsp1_entity *ent = &rwpf->entity;
> > +	const struct vsp1_format_info *fmtinfo;
> > +	struct v4l2_subdev_format format = {};
> > +	u32 vspx_fourcc;
> > +
> > +	switch (isp_fourcc) {
> > +	case V4L2_PIX_FMT_GREY:
> > +		/* 8 bit RAW Bayer image. */
> > +		vspx_fourcc = V4L2_PIX_FMT_RGB332;
> > +		break;
> > +	case V4L2_PIX_FMT_Y10:
> > +	case V4L2_PIX_FMT_Y12:
> > +	case V4L2_PIX_FMT_Y16:
> > +		/* 10, 12 and 16 bit RAW Bayer image. */
> > +		vspx_fourcc = V4L2_PIX_FMT_RGB565;
> > +		break;
> > +	case V4L2_META_FMT_GENERIC_8:
> > +		/* ConfigDMA parameters buffer. */
> > +		vspx_fourcc = V4L2_PIX_FMT_XBGR32;
> > +		break;
> > +	default:
> > +		return -EINVAL;
> > +	}
> > +
> > +	fmtinfo = vsp1_get_format_info(vsp1, vspx_fourcc);
> > +	rwpf->fmtinfo = fmtinfo;
>
> 	rwpf->fmtinfo = vsp1_get_format_info(vsp1, vspx_fourcc);
>
> and drop the local variable.
>

ok

> > +
> > +	format.which = V4L2_SUBDEV_FORMAT_ACTIVE;
> > +	format.pad = RWPF_PAD_SINK;
> > +	format.format.width = width;
> > +	format.format.height = height;
> > +	format.format.field = V4L2_FIELD_NONE;
> > +	format.format.code = fmtinfo->mbus;
> > +
> > +	return v4l2_subdev_call(&ent->subdev, pad, set_fmt, NULL, &format);
> > +}
> > +
> > +/* Configure RPF0 for ConfigDMA and RAW image transfer. */
>
> s/and/or/
>
> But you also configure the WPF, so I'd write
>
> /* Configure the RPF -> IIF -> WPF pipeline for ConfigDMA or RAW image transfer. */
>
> and rename the function to vsp1_vspx_pipeline_configure().

ok

>
> > +static int vsp1_vspx_rpf0_configure(struct vsp1_device *vsp1,
> > +				    dma_addr_t addr, u32 isp_fourcc,
> > +				    unsigned int width, unsigned int height,
> > +				    unsigned int stride,
> > +				    unsigned int iif_sink_pad,
> > +				    struct vsp1_dl_list *dl,
> > +				    struct vsp1_dl_body *dlb)
> > +{
> > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > +	struct vsp1_rwpf *rpf0 = pipe->inputs[0];
> > +	int ret;
> > +
> > +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, rpf0, isp_fourcc, width,
> > +					    height);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output, isp_fourcc,
> > +					    width, height);
> > +	if (ret)
> > +		return ret;
> > +
> > +	vsp1_pipeline_calculate_partition(pipe, &pipe->part_table[0],
> > +					  width, 0);
> > +
>
> You should also set rwpf->format.num_planes to 1 here.
>

ack

> > +	rpf0->format.plane_fmt[0].bytesperline = stride;
> > +
> > +	/*
> > +	 * Connect RPF0 to the IIF sink pad corresponding to the config or image
> > +	 * path.
> > +	 */
> > +	rpf0->entity.sink_pad = iif_sink_pad;
> > +
> > +	pipe->part_table[0].rpf[0].width = width;
> > +	pipe->part_table[0].rpf[0].height = height;
>
> Isn't this handled by vsp1_pipeline_calculate_partition() ?
>

I don't see it happening in the vsp1_pipeline_calculate_partition()
call chain...

vsp1_pipeline_calculate_partition() calls
vsp1_pipeline_propagate_partition() which calls the 'partition' op on
entities in the pipeline.

The RPF implementation of the 'partition' op however initializes the
crop retangles on the entity, but not the part_table[].

To be honest that partition part is still a bit obscure to me, so I
might be missing something indeed

> > +
> > +	rpf0->mem.addr[0] = addr;
> > +	rpf0->mem.addr[1] = 0;
> > +	rpf0->mem.addr[2] = 0;
> > +
> > +	vsp1_entity_route_setup(&rpf0->entity, pipe, dlb);
> > +	vsp1_entity_configure_stream(&rpf0->entity, rpf0->entity.state, pipe,
> > +				     dl, dlb);
> > +	vsp1_entity_configure_partition(&rpf0->entity, pipe,
> > +					&pipe->part_table[0], dl, dlb);
> > +
> > +	return 0;
> > +}
> > +
> > +/* -----------------------------------------------------------------------------
> > + * Interrupt handling
> > + */
> > +
> > +static void vsp1_vspx_pipeline_frame_end(struct vsp1_pipeline *pipe,
> > +					 unsigned int completion)
> > +{
> > +	struct vsp1_vspx_pipeline *vspx_pipe = to_vsp1_vspx_pipeline(pipe);
> > +
> > +	if (vspx_pipe->vspx_frame_end)
> > +		vspx_pipe->vspx_frame_end(vspx_pipe->frame_end_data);
> > +}
> > +
> > +/* -----------------------------------------------------------------------------
> > + * ISP Driver API (include/media/vsp1.h)
> > + */
> > +
> > +/**
> > + * vsp1_isp_init() - Initialize the VSPX
> > + *
> > + * @dev: The VSP1 struct device
> > + *
> > + * Return: %0 on success or a negative error code on failure
> > + */
> > +int vsp1_isp_init(struct device *dev)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +
> > +	if (!vsp1)
> > +		return -EPROBE_DEFER;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_init);
> > +
> > +/**
> > + * vsp1_isp_get_bus_master - Get VSPX bus master
> > + *
> > + * The VSPX accesses memory through an FCPX instance. When allocating memory
> > + * buffers that will have to be accessed by the VSPX the 'struct device' of
> > + * the FCPX should be used. Use this function to get a reference to it.
>
> The function description should go after parameters. Same elsewhere
> where applicable.
>

Ah indeed, I got all of these wrong :/

> > + *
> > + * @dev: The VSP1 struct device
> > + *
> > + * Return: a pointer to the bus master's device
> > + */
> > +struct device *vsp1_isp_get_bus_master(struct device *dev)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +
> > +	if (!vsp1)
> > +		return ERR_PTR(-ENODEV);
> > +
> > +	return vsp1->bus_master;
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_get_bus_master);
> > +
> > +/**
> > + * vsp1_isp_alloc_buffer - Allocate a buffer in the VSPX address space
> > + *
> > + * Allocate a buffer that will be later accessed by the VSPX. Buffers allocated
> > + * using vsp1_isp_alloc_buffer() shall be released with a call to
> > + * vsp1_isp_free_buffer().
>
> I would explain somewhere that this is meant for ISP configuration
> parameters.
>

I'll add

      * This function is used by the ISP driver to allocate memory for
      * the ConfigDMA parameters buffer.

> > + *
> > + * @dev: The VSP1 struct device
> > + * @size: The size of the buffer to be allocated by the VSPX
> > + * @buffer_desc: The allocated buffer descriptor, will be filled with the
> > + *		 buffer CPU-mapped address, the bus address and the allocated
> > + *		 buffer size
> > + *
> > + * Return: %0 on success or a negative error code on failure
> > + */
> > +int vsp1_isp_alloc_buffer(struct device *dev, size_t size,
> > +			  struct vsp1_isp_buffer_desc *buffer_desc)
> > +{
> > +	struct device *bus_master = vsp1_isp_get_bus_master(dev);
> > +
> > +	if (IS_ERR_OR_NULL(bus_master))
> > +		return -ENODEV;
> > +
> > +	buffer_desc->cpu_addr = dma_alloc_coherent(bus_master, size,
> > +						   &buffer_desc->dma_addr,
> > +						   GFP_KERNEL);
> > +	if (!buffer_desc->cpu_addr)
> > +		return -ENOMEM;
> > +
> > +	buffer_desc->size = size;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_alloc_buffer);
> > +
> > +/**
> > + * vsp1_isp_free_buffer - Release a buffer allocated by vsp1_isp_alloc_buffer()
> > + *
> > + * Release memory in the VSPX address space allocated by
> > + * vsp1_isp_alloc_buffer().
> > + *
> > + * @dev: The VSP1 struct device
> > + * @buffer_desc: The descriptor of the buffer to release as returned by
> > + *		 vsp1_isp_alloc_buffer()
> > + */
> > +void vsp1_isp_free_buffer(struct device *dev,
> > +			  struct vsp1_isp_buffer_desc *buffer_desc)
> > +{
> > +	struct device *bus_master = vsp1_isp_get_bus_master(dev);
> > +
> > +	if (IS_ERR_OR_NULL(bus_master))
> > +		return;
> > +
> > +	dma_free_coherent(bus_master, buffer_desc->size, buffer_desc->cpu_addr,
> > +			  buffer_desc->dma_addr);
> > +}
> > +
> > +/**
> > + * vsp1_isp_start_streaming - Start processing VSPX jobs
> > + *
> > + * Start the VSPX and prepare for accepting buffer transfer job requests.
> > + *
> > + * @dev: The VSP1 struct device
> > + * @frame_end: The frame end callback description
> > + *
> > + * Return: %0 on success or a negative error code on failure
> > + */
> > +int vsp1_isp_start_streaming(struct device *dev,
> > +			     struct vsp1_vspx_frame_end *frame_end)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > +	unsigned long flags;
> > +	u32 value;
> > +	int ret;
> > +
> > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
>
> Can this function ever be called with interrupts disabled ? If no, you
> can use spin_lock_irq().
>

I think the question here should rather be: do we need to disable
local interrupts at all when calling this function ? As the VSPX
workflow is now driven by ISP and there are no contentions between any
of the driver functions and the VSPX interrupt handler I guess I can
use spin_lock() and spin_unlock() everywhere and replace the
irqsave/irqrestore versions ?

> > +
> > +	if (vspx_pipe->enabled) {
> > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > +		return 0;
>
> Shouldn't this be an error ?
>

Should it ? I don't think it's an error conceptually, as the ISP
driver is currently designed however a single call to this function
should happen, so I can flag this as an error.

However, as S_STREAM(1) is called on every video node, I would not
rule out a design where each video dev tries to start the VSPX. The
ones that arrive late will just hit a nop.

> > +	}
> > +
> > +	if (!frame_end) {
>
> You can move this check before taking the lock.
>

True that

> > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > +		return -EINVAL;
> > +	}
> > +
> > +	vspx_pipe->vspx_frame_end = frame_end->vspx_frame_end;
> > +	vspx_pipe->frame_end_data = frame_end->frame_end_data;
>
> Move this just before setting ->enabled to true, so you won't override
> the values if the function returns an error in the checks below.
>
> > +
> > +	/* Make sure VSPX is not active. */
>
> This should never happen unless there's a serious bug in the driver,
> right ? I would make that explicit in the comment:
>
> 	/*
> 	 * Make sure VSPX is not active. This should never happen in normal
> 	 * usage.
> 	 */

ok

>
> > +	value = vsp1_read(vsp1, VI6_CMD(0));
> > +	if (value & VI6_CMD_STRCMD) {
> > +		dev_err(vsp1->dev,
> > +			"%s: Starting of WPF0 already reserved\n", __func__);
> > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > +		return -EBUSY;
> > +	}
> > +
> > +	value = vsp1_read(vsp1, VI6_STATUS);
> > +	if (value & VI6_STATUS_SYS_ACT(0)) {
> > +		dev_err(vsp1->dev,
> > +			"%s: WPF0 has not entered idle state\n", __func__);
> > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > +		return -EBUSY;
> > +	}
> > +
> > +	vspx_pipe->enabled = true;
> > +
> > +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
>
> and close the guarded scope here
>
> 	}
>

I found it to be less readable because of the increased scoping.

> > +
> > +	/* Enable the VSP1 and prepare for streaming. */
> > +	vsp1_pipeline_dump(pipe, "VSPX job");
> > +
> > +	ret = vsp1_device_get(vsp1);
>
> This should be done before you read the registers.
>

Ah right. I guess this function should be then re-designed.


> > +	if (ret < 0) {
> > +		guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
> > +		vspx_pipe->enabled = false;
> > +		return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_start_streaming);
> > +
> > +/**
> > + * vsp1_isp_stop_streaming - Stop the VSPX
> > + *
> > + * @dev: The VSP1 struct device
> > + */
> > +void vsp1_isp_stop_streaming(struct device *dev)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> > +
> > +	if (!vspx_pipe->enabled) {
> > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > +		return;
> > +	}
>
> 	scoped_guard(spinlock_irqsave)(&vspx_pipe->vspx_lock, flags) {
> 		if (!vspx_pipe->enabled)
> 			return;
>
> 		...
>
> (or spinlock_irq, see comment in vsp1_isp_start_streaming()).
>

Am I mistaken that the _irq version you suggest still disables
local interrupts, which is something I shouldn't need

-------------------------------------------------------------------------------
From "Documentation/kernel-hacking/locking.rst"

If a hardware irq handler shares data with a softirq, you have two
concerns. Firstly, the softirq processing can be interrupted by a
hardware interrupt, and secondly, the critical region could be entered
by a hardware interrupt on another CPU. This is where
spin_lock_irq() is used. It is defined to disable
interrupts on that cpu, then grab the lock.
spin_unlock_irq() does the reverse.
-------------------------------------------------------------------------------

I guess I should simply use spin_lock/spin_unlock

> > +
> > +	vspx_pipe->enabled = false;
> > +
> > +	pipe->state = VSP1_PIPELINE_STOPPED;
> > +	vsp1_pipeline_stop(pipe);
>
> You can't call this with a spin lock held. We can look together at how
> to handle locking in this driver, and use a mutex instead of a spinlock
> (I know you went the other way around).

Actually my only concern with mutexes is that they can't be called by
the ISP driver while it holds a spinlock. With this new version the
only function that would need to be called with a spinlock taken is
vsp1_isp_job_run() and vsp1_isp_jobs_release() which only contends the
jobs_queue with the rest of the functions of the VSPX driver.

I could use a mutex as a main lock here and a spinlock to protect the
jobs queue.

Or I could simply call the vsp1_pipeline_stop() outside of the
critical section here.

>
> > +	vspx_pipe->vspx_frame_end = NULL;
> > +	vsp1_dlm_reset(pipe->output->dlm);
> > +
> > +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
>
> 	}
>
> > +
> > +	vsp1_device_put(vsp1);
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_stop_streaming);
> > +
> > +/**
> > + * vsp1_vspx_job_prepare - Prepare a new buffer transfer job
> > + *
> > + * Prepare a new buffer transfer job by populating a display list that will be
> > + * later executed by a call to vsp1_isp_job_run().
> > + *
> > + * The newly created job is appended to the queue of pending jobs. All pending
> > + * jobs must be released after stopping the streaming operations with a call
> > + * to vsp1_isp_jobs_release().
> > + *
> > + * @dev: The VSP1 struct device
> > + * @job: The job description
> > + *
> > + * Return: %0 on success or a negative error code on failure
> > + */
> > +int vsp1_isp_job_prepare(struct device *dev,
> > +			 const struct vsp1_isp_job_desc *desc)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > +	const struct v4l2_pix_format_mplane *pix_mp;
> > +	struct vsp1_dl_list *second_dl = NULL;
> > +	struct vsp1_vspx_job *job;
> > +	struct vsp1_dl_body *dlb;
> > +	struct vsp1_dl_list *dl;
> > +	int ret;
> > +
> > +	/* Memory will be released when the job is consumed. */
> > +	job = kmalloc(sizeof(*job), GFP_KERNEL);
> > +	if (!job)
> > +		return -ENOMEM;
>
> Hmmm... I think this can be improved. I would need to see how the ISP
> driver uses this API, but I'm thinking the job data could be stored in
> vsp1_isp_job_desc, without adding the job to a queue in this function.
> The job would then be passed to the run function, which would queue it.
> I think the resulting API would be cleaner. You will need an extra
> function to delete/destroy/cleanup a job that hasn't been queued (the
> counterpart of this function, essentially).
>

I'm not sure I got this to be honest.

A VSPX job is basically just a display list + a list handle.
Are you suggesting add the display list to vsp1_isp_job_desc and keep
a queue of jobs on the ISP side only ? Then pass the dl in to
vsp1_isp_job_run() ? This would avoid maintaing a queue of jobs on the
VSPX side at all most probably and save me the hassle to cleanup the
jobs queue on the VSPX side.

I'll see how this could look.

> > +
> > +	/*
> > +	 * Transfer the buffers described in the job: an optional ConfigDMA
> > +	 * parameters buffer and a RAW image.
> > +	 */
> > +
> > +	job->dl = vsp1_dl_list_get(pipe->output->dlm);
> > +	if (!job->dl) {
> > +		ret = -ENOMEM;
> > +		goto error_free_job;
> > +	}
> > +
> > +	dl = job->dl;
> > +	dlb = vsp1_dl_list_get_body0(dl);
> > +
> > +	/* Disable RPF1. */
> > +	vsp1_dl_body_write(dlb, vsp1->rpf[1]->entity.route->reg,
> > +			   VI6_DPR_NODE_UNUSED);
>
> The route is disabled in vsp1_device_init(), and we never use RPF1. I
> think this can be dropped.
>

ack

> > +
> > +	/* Configure IIF routing and enable IIF function */
>
> s/function/function./
>
> > +	vsp1_entity_route_setup(pipe->iif, pipe, dlb);
> > +	vsp1_entity_configure_stream(pipe->iif, pipe->iif->state, pipe,
> > +				     dl, dlb);
> > +
> > +	/* Configure WPF0 to enable RPF0 as source*/
>
> s/source/source. /
>
> > +	vsp1_entity_route_setup(&pipe->output->entity, pipe, dlb);
> > +	vsp1_entity_configure_stream(&pipe->output->entity,
> > +				     pipe->output->entity.state, pipe,
> > +				     dl, dlb);
>
> I'm kind of tempting to call route_setup, configure_stream,
> configure_frame and configure_partition by iterating over the entities
> in the pipe, as done by vsp1_du_pipeline_configure(). The code would be
> more generic, at the cost of a few calls being duplicated between the
> config DMA and image configurations. What do you think ?
>

The fact that the RPF0 is optionally configurable for ConfigDMA

> > +
> > +	if (desc->config.pairs) {

here

makes it hard for me to see how this could be done cleanly.

To be honest, the pipeline is fixed (rpf->iif->wpf) so I don't see
much benefits here


> > +		/*
> > +		 * Configure RPF0 for config data. Transfer the number of
> > +		 * configuration pairs plus 2 words for the header.
> > +		 */
> > +		ret = vsp1_vspx_rpf0_configure(vsp1, desc->config.mem,
> > +					       V4L2_META_FMT_GENERIC_8,
> > +					       desc->config.pairs * 2 + 2, 1,
> > +					       desc->config.pairs * 2 + 2,
> > +					       VSPX_IIF_SINK_PAD_CONFIG,
> > +					       dl, dlb);
> > +		if (ret)
> > +			goto error_put_dl;
> > +
> > +		second_dl = vsp1_dl_list_get(pipe->output->dlm);
> > +		if (!second_dl) {
> > +			ret = -ENOMEM;
> > +			goto error_put_dl;
> > +		}
> > +
> > +		dl = second_dl;
> > +		dlb = vsp1_dl_list_get_body0(dl);
> > +	}
> > +
> > +	/* Configure RPF0 for RAW image transfer. */
> > +	pix_mp = &desc->img.fmt.fmt.pix_mp;
> > +	ret = vsp1_vspx_rpf0_configure(vsp1, desc->img.mem,
> > +				       pix_mp->pixelformat,
> > +				       pix_mp->width, pix_mp->height,
> > +				       pix_mp->plane_fmt[0].bytesperline,
> > +				       VSPX_IIF_SINK_PAD_IMG, dl, dlb);
> > +	if (ret)
> > +		goto error_put_dl;
> > +
> > +	if (second_dl)
> > +		vsp1_dl_list_add_chain(job->dl, second_dl);
> > +
> > +	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
> > +		list_add_tail(&job->job_queue, &vspx_pipe->jobs);
> > +	}
> > +
> > +	return 0;
> > +
> > +error_put_dl:
> > +	if (second_dl)
> > +		vsp1_dl_list_put(second_dl);
> > +	vsp1_dl_list_put(job->dl);
> > +error_free_job:
> > +	kfree(job);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_job_prepare);
> > +
> > +/**
> > + * vsp1_isp_job_run - Run the first pending transfer job
> > + *
> > + * Extract the first available job from the pending job queue and execute the
> > + * display list there contained. The ISP driver shall call this function when a
> > + * new transfer of a ConfigDMA parameters buffer and RAW image is required.
> > + *
> > + * @dev: The VSP1 struct device
> > + *
> > + * Return: %0 on success or a negative error code on failure
> > + */
> > +int vsp1_isp_job_run(struct device *dev)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > +	struct vsp1_vspx_job *job;
> > +	u32 value;
> > +
> > +	/* Make sure VSPX is not busy processing a frame. */
> > +	value = vsp1_read(vsp1, VI6_CMD(0));
> > +	if (value) {
> > +		dev_err(vsp1->dev,
> > +			"%s: Starting of WPF0 already reserved\n", __func__);
> > +		return -EBUSY;
> > +	}
> > +
> > +	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
> > +
> > +		job = list_first_entry_or_null(&vspx_pipe->jobs,
> > +					       struct vsp1_vspx_job,
> > +					       job_queue);
> > +		if (!job)
> > +			return -ENODEV;
> > +
> > +		list_del(&job->job_queue);
> > +
> > +		/* Apply the display list. */
> > +		vsp1_dl_list_commit(job->dl, 0);
> > +		kfree(job);
> > +	}
> > +
> > +	scoped_guard(spinlock_irqsave, &pipe->irqlock) {
> > +		/*
> > +		 * Operating the vsp1_pipe in singleshot mode requires to
> > +		 * manually set the pipeline state to stopped.
> > +		 */
> > +		pipe->state = VSP1_PIPELINE_STOPPED;
> > +		vsp1_pipeline_run(pipe);
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_job_run);
> > +
> > +/**
> > + * vsp1_isp_jobs_release - Release all pending VSPX jobs
> > + *
> > + * Release jobs prepared by a call to vsp1_vspx_job_prepare() and not yet
> > + * processed. Pending jobs shall be released after streaming has been stopped.
> > + *
> > + * @dev: The VSP1 struct device
> > + */
> > +void vsp1_isp_jobs_release(struct device *dev)
> > +{
> > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > +	struct vsp1_vspx_job *job, *tmp;
> > +
> > +	guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
> > +
> > +	list_for_each_entry_safe(job, tmp, &vspx_pipe->jobs, job_queue) {
> > +		list_del(&job->job_queue);
> > +		vsp1_dl_list_put(job->dl);
> > +		kfree(job);
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(vsp1_isp_jobs_release);
> > +
> > +/* -----------------------------------------------------------------------------
> > + * Initialization and cleanup
> > + */
> > +
> > +int vsp1_vspx_init(struct vsp1_device *vsp1)
> > +{
> > +	struct vsp1_vspx_pipeline *vspx_pipe;
> > +	struct vsp1_pipeline *pipe;
> > +
> > +	vsp1->vspx = devm_kzalloc(vsp1->dev, sizeof(*vsp1->vspx), GFP_KERNEL);
> > +	if (!vsp1->vspx)
> > +		return -ENOMEM;
> > +
> > +	vsp1->vspx->vsp1 = vsp1;
> > +
> > +	vspx_pipe = &vsp1->vspx->pipe;
> > +	vspx_pipe->enabled = false;
> > +
> > +	pipe = &vspx_pipe->pipe;
> > +
> > +	vsp1_pipeline_init(pipe);
> > +
> > +	pipe->partitions = 1;
> > +	pipe->part_table = &vspx_pipe->partition;
> > +	pipe->interlaced = false;
> > +	pipe->frame_end = vsp1_vspx_pipeline_frame_end;
> > +
> > +	INIT_LIST_HEAD(&vspx_pipe->jobs);
> > +	spin_lock_init(&vspx_pipe->vspx_lock);
> > +
> > +	/*
> > +	 * Initialize RPF0 as input for VSPX and use it unconditionally for
> > +	 * now.
> > +	 */
> > +	pipe->inputs[0] = vsp1->rpf[0];
> > +	pipe->inputs[0]->entity.pipe = pipe;
> > +	pipe->inputs[0]->entity.sink = &vsp1->iif->entity;
> > +	vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->inputs[0],
> > +				      vspx_default_fmt.pixelformat,
> > +				      vspx_default_fmt.width,
> > +				      vspx_default_fmt.height);
>
> Is there an actual need for this default configuration, given that you
> always reprogram the RPF and WPF when preparing a job ? I think you
> could just drop the call, and the vspx_default_fmt structure.
>

No need, was just to get a sane default. But as you say, it will get
overridden anyway

> > +	list_add(&pipe->inputs[0]->entity.list_pipe, &pipe->entities);
>
> s/list_add/list_add_tail/
>
> It won't make any functional difference, but will align the code with
> the calls below.
>

It's the first member, I though add() followed by add_tail()
add_tail() would make it clearer what the order is.

Thanks
  j

> > +
> > +	pipe->iif = &vsp1->iif->entity;
> > +	pipe->iif->pipe = pipe;
> > +	pipe->iif->sink = &vsp1->wpf[0]->entity;
> > +	pipe->iif->sink_pad = RWPF_PAD_SINK;
> > +	list_add_tail(&pipe->iif->list_pipe, &pipe->entities);
> > +
> > +	pipe->output = vsp1->wpf[0];
> > +	pipe->output->entity.pipe = pipe;
> > +	vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output,
> > +				      vspx_default_fmt.pixelformat,
> > +				      vspx_default_fmt.width,
> > +				      vspx_default_fmt.height);
>
> Same here.
>
> > +	list_add_tail(&pipe->output->entity.list_pipe, &pipe->entities);
> > +
> > +	return 0;
> > +}
> > +
> > +void vsp1_vspx_cleanup(struct vsp1_device *vsp1)
> > +{
> > +	vsp1_isp_jobs_release(vsp1->dev);
> > +}
> > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.h b/drivers/media/platform/renesas/vsp1/vsp1_vspx.h
> > new file mode 100644
> > index 000000000000..f871bf9e7dec
> > --- /dev/null
> > +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.h
> > @@ -0,0 +1,16 @@
> > +/* SPDX-License-Identifier: GPL-2.0+ */
> > +/*
> > + * vsp1_vspx.h  --  R-Car Gen 4 VSPX
> > + *
> > + * Copyright (C) 2025 Ideas On Board Oy
> > + * Copyright (C) 2025 Renesas Electronics Corporation
> > + */
> > +#ifndef __VSP1_VSPX_H__
> > +#define __VSP1_VSPX_H__
> > +
> > +#include "vsp1.h"
> > +
> > +int vsp1_vspx_init(struct vsp1_device *vsp1);
> > +void vsp1_vspx_cleanup(struct vsp1_device *vsp1);
> > +
> > +#endif /* __VSP1_VSPX_H__ */
> > diff --git a/include/media/vsp1.h b/include/media/vsp1.h
> > index 48f4a5023d81..6843e8e85520 100644
> > --- a/include/media/vsp1.h
> > +++ b/include/media/vsp1.h
> > @@ -15,6 +15,10 @@
> >
> >  struct device;
> >
> > +/* -----------------------------------------------------------------------------
> > + * VSP1 DU interface
> > + */
> > +
> >  int vsp1_du_init(struct device *dev);
> >
> >  #define VSP1_DU_STATUS_COMPLETE		BIT(0)
> > @@ -117,4 +121,75 @@ void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index,
> >  int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt);
> >  void vsp1_du_unmap_sg(struct device *dev, struct sg_table *sgt);
> >
> > +/* -----------------------------------------------------------------------------
> > + * VSP1 ISP interface
> > + */
> > +
> > +/**
> > + * struct vsp1_isp_buffer_desc - Describe a buffer allocated by VSPX
> > + *
> > + * @size: Byte size of the buffer allocated by VSPX
> > + * @cpu_addr: CPU-mapped address of a buffer allocated by VSPX
> > + * @dma_addr: bus address of a buffer allocated by VSPX
> > + */
> > +struct vsp1_isp_buffer_desc {
> > +	size_t size;
> > +	void *cpu_addr;
> > +	dma_addr_t dma_addr;
> > +};
> > +
> > +/**
> > + * struct vsp1_isp_job_desc - Describe a VSPX buffer transfer request
> > + *
> > + * Describe a transfer request for the VSPX to perform on behalf of the ISP.
> > + * The transfer job descriptor contains an optional ConfigDMA buffer and
> > + * one RAW image buffer. Set config.pairs to 0 if no ConfigDMA buffer should
> > + * be transferred.
> > + *
> > + * @config: ConfigDMA buffer
> > + * @config.pairs: number of reg-value pairs in the ConfigDMA buffer
> > + * @config.mem: bus address of the ConfigDMA buffer
> > + * @img: RAW image buffer
> > + * @img.fmt: RAW image format
> > + * @img.mem: bus address of the RAW image buffer
> > + */
> > +struct vsp1_isp_job_desc {
> > +	struct {
> > +		unsigned int pairs;
> > +		dma_addr_t mem;
> > +	} config;
> > +	struct {
> > +		struct v4l2_format fmt;
> > +		dma_addr_t mem;
> > +	} img;
> > +};
> > +
> > +/**
> > + * struct vsp1_vspx_frame_end - VSPX frame end callback data
> > + *
> > + * @vspx_frame_end: Frame end callback. Called after a transfer job has been
> > + *		    completed. If the job includes both a ConfigDMA and a
> > + *		    RAW image, the callback is called after both have been
> > + *		    transferred
> > + * @frame_end_data: Frame end callback data, passed to vspx_frame_end
> > + */
> > +struct vsp1_vspx_frame_end {
> > +	void (*vspx_frame_end)(void *data);
> > +	void *frame_end_data;
> > +};
> > +
> > +int vsp1_isp_init(struct device *dev);
> > +struct device *vsp1_isp_get_bus_master(struct device *dev);
> > +int vsp1_isp_alloc_buffer(struct device *dev, size_t size,
> > +			  struct vsp1_isp_buffer_desc *buffer_desc);
> > +void vsp1_isp_free_buffer(struct device *dev,
> > +			  struct vsp1_isp_buffer_desc *buffer_desc);
> > +int vsp1_isp_start_streaming(struct device *dev,
> > +			     struct vsp1_vspx_frame_end *frame_end);
> > +void vsp1_isp_stop_streaming(struct device *dev);
> > +int vsp1_isp_job_prepare(struct device *dev,
> > +			 const struct vsp1_isp_job_desc *desc);
> > +int vsp1_isp_job_run(struct device *dev);
> > +void vsp1_isp_jobs_release(struct device *dev);
> > +
> >  #endif /* __MEDIA_VSP1_H__ */
> >
> > ---
> > base-commit: 1d41f477d6ff5f5eb0b78b37644ffac0785602c9
> > change-id: 20250502-b4-vspx-90c815bff6dd
>
> --
> Regards,
>
> Laurent Pinchart
Re: [PATCH v8] media: vsp1: Add VSPX support
Posted by Laurent Pinchart 7 months, 3 weeks ago
Hi Jacopo,

I realize I haven't replied to this e-mail. Sorry about that. I've added
a few comments below related to issues that I think are still relevant
to v10.

On Mon, May 05, 2025 at 10:19:57AM +0200, Jacopo Mondi wrote:
> On Fri, May 02, 2025 at 11:26:44PM +0300, Laurent Pinchart wrote:
> > On Fri, May 02, 2025 at 03:23:10PM +0200, Jacopo Mondi wrote:
> > > Add support for VSPX, a specialized version of the VSP2 that
> > > transfers data to the ISP. The VSPX is composed of two RPF units
> > > to read data from external memory and an IIF instance that performs
> > > transfer towards the ISP.
> > >
> > > The VSPX is supported through a newly introduced vsp1_vspx.c file that
> > > exposes two interfaces: vsp1_vspx interface, declared in vsp1_vspx.h
> > > for the vsp1 core to initialize and cleanup the VSPX, and a vsp1_isp
> > > interface, declared in include/media/vsp1.h for the ISP driver to
> > > control the VSPX operations.
> > >
> > > Signed-off-by: Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
> > > ---
> > > The VSPX is a VSP2 function that reads data from external memory using
> > > two RPF instances and feed it to the ISP.
> > >
> > > The VSPX includes an IIF unit (ISP InterFace) modeled in the vsp1 driver
> > > as a new, simple, entity type.
> > >
> > > IIF is part of VSPX, a version of the VSP2 IP specialized for ISP
> > > interfacing. To prepare to support VSPX, support IIF first by
> > > introducing a new entity and by adjusting the RPF/WPF drivers to
> > > operate correctly when an IIF is present.
> > >
> > > Changes in v8:
> > > - Remove patches already collected by Laurent in
> > >   [GIT PULL FOR v6.16] Renesas media drivers changes
> > >
> > > - Rebased on
> > >   https://gitlab.freedesktop.org/linux-media/users/pinchartl.git #renesas-next
> > >
> > > - Changes to the VSPX interface towards the ISP
> > >   - Split start/stop_streaming
> > >   - Add vsp1_isp_jobs_release() to release pending jobs
> > >   - Add vsp1_isp_free_buffer()
> > >   - Remove vsp1_isp_configure() and compute partitions on job creation
> > >
> > > - Driver changes
> > >   - Drop irq-driver flow
> > >     The VSPX used to schedule new jobs as soon as processing the last
> > >     one is done. This doesn't work well with the R-Car ISP design
> > >     for two reasons:
> > >     - The ISP needs per-job registers programming
> > >     - The ISP and VSPX job queues have to stay in sync
> > >
> > > - Minors
> > >   - Remove the jobs_lock as a single lock is fine
> > >   - Protect against VSPX/ISP irq races in job_run() by checking the
> > >     VSPX 'busy' register and remove the 'processing' flag
> > >   - Manually set the pipeline state to STOPPED before scheduling a new
> > >     job without waiting for frame_end
> > >
> > > Changes in v7:
> > > - Include VSPX driver in the series
> > > - Use existing VSP1 formats and remove patches extending formats on RPF
> > > - Rework VSPX driver to split jobs creation and scheduling in two
> > >   different API entry points
> > > - Fix VSPX stride using the user provided bytesperline and using the
> > >   buffer size for ConfigDMA buffers
> > > - Link to v6: https://lore.kernel.org/r/20250321-v4h-iif-v6-0-361e9043026a@ideasonboard.com
> > >
> > > Changes in v6:
> > > - Little cosmetic change as suggested by Laurent
> > > - Collect tags
> > > - Link to v5: https://lore.kernel.org/r/20250319-v4h-iif-v5-0-0a10456d792c@ideasonboard.com
> > >
> > > Changes in v5:
> > > - Drop additional empty line 5/6
> > > - Link to v4: https://lore.kernel.org/r/20250318-v4h-iif-v4-0-10ed4c41c195@ideasonboard.com
> > >
> > > Changes in v4:
> > > - Fix SWAP bits for RAW10, RAW12 and RAW16
> > > - Link to v3: https://lore.kernel.org/r/20250317-v4h-iif-v3-0-63aab8982b50@ideasonboard.com
> > >
> > > Changes in v3:
> > > - Drop 2/6 from v2
> > > - Add 5/7 to prepare for a new implementation of 6/7
> > > - Individual changelog per patch
> > > - Add 7/7
> > > - Link to v2: https://lore.kernel.org/r/20250224-v4h-iif-v2-0-0305e3c1fe2d@ideasonboard.com
> > >
> > > Changes in v2:
> > > - Collect tags
> > > - Address review comments from Laurent, a lot of tiny changes here and
> > >   there but no major redesign worth an entry in the patchset changelog
> >
> > You've come a long way since v1, I think we're getting close to a good
> > implementation.
> 
> Thank you and Niklas for reviews and testing
> 
> > > ---
> > >  drivers/media/platform/renesas/vsp1/Makefile    |   1 +
> > >  drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
> > >  drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
> > >  drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
> > >  drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
> > >  drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
> > >  include/media/vsp1.h                            |  75 +++
> > >  7 files changed, 770 insertions(+), 1 deletion(-)

[snip]

> > > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > > new file mode 100644
> > > index 000000000000..6edb5e4929d9
> > > --- /dev/null
> > > +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > > @@ -0,0 +1,664 @@

[snip]

> > > +static int vsp1_vspx_rpf0_configure(struct vsp1_device *vsp1,
> > > +				    dma_addr_t addr, u32 isp_fourcc,
> > > +				    unsigned int width, unsigned int height,
> > > +				    unsigned int stride,
> > > +				    unsigned int iif_sink_pad,
> > > +				    struct vsp1_dl_list *dl,
> > > +				    struct vsp1_dl_body *dlb)
> > > +{
> > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > +	struct vsp1_rwpf *rpf0 = pipe->inputs[0];
> > > +	int ret;
> > > +
> > > +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, rpf0, isp_fourcc, width,
> > > +					    height);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output, isp_fourcc,
> > > +					    width, height);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	vsp1_pipeline_calculate_partition(pipe, &pipe->part_table[0],
> > > +					  width, 0);
> > > +
> >
> > You should also set rwpf->format.num_planes to 1 here.
> 
> ack
> 
> > > +	rpf0->format.plane_fmt[0].bytesperline = stride;
> > > +
> > > +	/*
> > > +	 * Connect RPF0 to the IIF sink pad corresponding to the config or image
> > > +	 * path.
> > > +	 */
> > > +	rpf0->entity.sink_pad = iif_sink_pad;
> > > +
> > > +	pipe->part_table[0].rpf[0].width = width;
> > > +	pipe->part_table[0].rpf[0].height = height;
> >
> > Isn't this handled by vsp1_pipeline_calculate_partition() ?
> 
> I don't see it happening in the vsp1_pipeline_calculate_partition()
> call chain...
> 
> vsp1_pipeline_calculate_partition() calls
> vsp1_pipeline_propagate_partition() which calls the 'partition' op on
> entities in the pipeline.
> 
> The RPF implementation of the 'partition' op however initializes the
> crop retangles on the entity, but not the part_table[].

The RPF .partition() handler is implemented as

static void rpf_partition(struct vsp1_entity *entity,
			  struct v4l2_subdev_state *state,
			  struct vsp1_pipeline *pipe,
			  struct vsp1_partition *partition,
			  unsigned int partition_idx,
			  struct v4l2_rect *window)
{
	struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
	struct v4l2_rect *rpf_rect = &partition->rpf[rpf->entity.index];

	/*
	 * Partition Algorithm Control
	 *
	 * The partition algorithm can split this frame into multiple slices. We
	 * must adjust our partition window based on the pipe configuration to
	 * match the destination partition window. To achieve this, we adjust
	 * our crop to provide a 'sub-crop' matching the expected partition
	 * window.
	 */
	*rpf_rect = *v4l2_subdev_state_get_crop(state, RWPF_PAD_SINK);

	if (pipe->partitions > 1) {
		rpf_rect->width = window->width;
		rpf_rect->left += window->left;
	}
}

The partition argument to the function is &pipe->part_table[0], so
rpf_rect is pipe->part_table[0].rpf[0].

Could you check if the values of pipe->part_table[0].rpf[0].width and
pipe->part_table[0].rpf[0].height match width and height after you call
vsp1_pipeline_calculate_partition() ?

> To be honest that partition part is still a bit obscure to me, so I
> might be missing something indeed
> 
> > > +
> > > +	rpf0->mem.addr[0] = addr;
> > > +	rpf0->mem.addr[1] = 0;
> > > +	rpf0->mem.addr[2] = 0;
> > > +
> > > +	vsp1_entity_route_setup(&rpf0->entity, pipe, dlb);
> > > +	vsp1_entity_configure_stream(&rpf0->entity, rpf0->entity.state, pipe,
> > > +				     dl, dlb);
> > > +	vsp1_entity_configure_partition(&rpf0->entity, pipe,
> > > +					&pipe->part_table[0], dl, dlb);
> > > +
> > > +	return 0;
> > > +}

[snip]

> > > +/**
> > > + * vsp1_isp_start_streaming - Start processing VSPX jobs
> > > + *
> > > + * Start the VSPX and prepare for accepting buffer transfer job requests.
> > > + *
> > > + * @dev: The VSP1 struct device
> > > + * @frame_end: The frame end callback description
> > > + *
> > > + * Return: %0 on success or a negative error code on failure
> > > + */
> > > +int vsp1_isp_start_streaming(struct device *dev,
> > > +			     struct vsp1_vspx_frame_end *frame_end)
> > > +{
> > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > +	unsigned long flags;
> > > +	u32 value;
> > > +	int ret;
> > > +
> > > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> >
> > Can this function ever be called with interrupts disabled ? If no, you
> > can use spin_lock_irq().
> 
> I think the question here should rather be: do we need to disable
> local interrupts at all when calling this function ? As the VSPX
> workflow is now driven by ISP and there are no contentions between any
> of the driver functions and the VSPX interrupt handler I guess I can
> use spin_lock() and spin_unlock() everywhere and replace the
> irqsave/irqrestore versions ?

It depends what the spinlock protects.

The point of the _irq and _irqsave variants is to prevent deadlocks. If
a code section protected by a spinlock is interrupted by an IRQ handler
on the same CPU, and the IRQ handler tries to take the same lock, a
deadlock will occur: the IRQ handler will spin on the lock to wait until
it gets released, but the lock will never get released as the code that
took it has been preempted by the IRQ handler. Note that such an IRQ
handler running on a different CPU is not an issue, it will spin on the
lock, but the lock will be released by the first CPU running in parallel
to the IRQ handler.

To solve that problem, the spin_lock_irq() and spin_unlock_irq()
functions respectively disable and enable local interrupts (i.e.
interrupts on the same CPU, not interrupts on other CPUs). They should
be called in contexts where interrupts are enabled (so not in an IRQ
handler, and not in a section covered by another spin_lock_irq() call or
uby other calls that disable local interrupts), and will ensure that no
IRQ handler can preempt the CPU.

Sometimes, you don't know if your code is running in a context where
interrupts are enabled or not. This is the case for instance in helper
functions that can be used in either contects. Using spin_lock_irq() is
safe there (disabling local interrupts when they are already disabled is
a no-op), but spin_unlock_irq() will enable local interrupts, which
would lead to problems. For these cases, spin_lock_irqsave() will save
the state of local interrupts and disable them, and
spin_unlock_irqrestore() will restore the local interrupts state to what
it was before spin_lock_irqsave().

It is safe to use spin_lock_irqsave() and spin_unlock_irqrestore()
unconditionally, but it has a performance impact. Using the right
variants isn't very difficult. Once you identify the data protected by
the lock, you need to look at where the data is accessed.

- If the data is accessed only in non-interrupt context, or only in
  interrupt context, use spin_lock() and spin_unlock() everywhere.

- If the data is accessed in both contexts, use

  - spin_lock() and spin_unlock() in code that runs only in contexts
    where local interrupts are disabled

  - spin_lock_irq() and spin_unlock_irq() in code that runs only in
    contexts where local interrupts are enabled

  - spin_lock_irqsave() and spin_unlock_irqrestore() in code that can
    run in either context

The vspx_lock is documented as "protect access to the VSPX configuration
and the jobs queue". That's quite vague, so the first thing you'll have
to do is to list the exact pieces of data that the lock protects, and
then decide on the locking scheme.

> > > +
> > > +	if (vspx_pipe->enabled) {
> > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > +		return 0;
> >
> > Shouldn't this be an error ?
> 
> Should it ? I don't think it's an error conceptually, as the ISP
> driver is currently designed however a single call to this function
> should happen, so I can flag this as an error.
> 
> However, as S_STREAM(1) is called on every video node, I would not
> rule out a design where each video dev tries to start the VSPX. The
> ones that arrive late will just hit a nop.

The API exposed by the vsp driver to the ISP driver needs to be designed
and clearly documented. There are different design options, but the API
should not rely on a particular call pattern from the ISP driver without
documenting it. As the documentation above doesn't indicate what the
call pattern is, I can't tell if the design is good or not.

This being said, if there are multiple userspace-facing devices that all
need to be started, someone will have to reference-count the VSP
start/stop state. Doing so in the vsp driver seems to add complexity
without any real benefit. I think that handling this in the ISP driver
would be better. I recommend making calling vsp1_isp_start_streaming()
on an already enabled VSP instance invalid, and returning an error. Same
when stopping the VSP, the ISP driver should decide when the VSP should
be stopped, and should not call vsp1_isp_stop_streaming() on an already
stopped instance.

> > > +	}
> > > +
> > > +	if (!frame_end) {
> >
> > You can move this check before taking the lock.
> 
> True that
> 
> > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	vspx_pipe->vspx_frame_end = frame_end->vspx_frame_end;
> > > +	vspx_pipe->frame_end_data = frame_end->frame_end_data;
> >
> > Move this just before setting ->enabled to true, so you won't override
> > the values if the function returns an error in the checks below.
> >
> > > +
> > > +	/* Make sure VSPX is not active. */
> >
> > This should never happen unless there's a serious bug in the driver,
> > right ? I would make that explicit in the comment:
> >
> > 	/*
> > 	 * Make sure VSPX is not active. This should never happen in normal
> > 	 * usage.
> > 	 */
> 
> ok
> 
> > > +	value = vsp1_read(vsp1, VI6_CMD(0));
> > > +	if (value & VI6_CMD_STRCMD) {
> > > +		dev_err(vsp1->dev,
> > > +			"%s: Starting of WPF0 already reserved\n", __func__);
> > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > +		return -EBUSY;
> > > +	}
> > > +
> > > +	value = vsp1_read(vsp1, VI6_STATUS);
> > > +	if (value & VI6_STATUS_SYS_ACT(0)) {
> > > +		dev_err(vsp1->dev,
> > > +			"%s: WPF0 has not entered idle state\n", __func__);
> > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > +		return -EBUSY;
> > > +	}
> > > +
> > > +	vspx_pipe->enabled = true;
> > > +
> > > +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> >
> > and close the guarded scope here
> >
> > 	}
> 
> I found it to be less readable because of the increased scoping.
> 
> > > +
> > > +	/* Enable the VSP1 and prepare for streaming. */
> > > +	vsp1_pipeline_dump(pipe, "VSPX job");
> > > +
> > > +	ret = vsp1_device_get(vsp1);
> >
> > This should be done before you read the registers.
> 
> Ah right. I guess this function should be then re-designed.
> 
> > > +	if (ret < 0) {
> > > +		guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
> > > +		vspx_pipe->enabled = false;
> > > +		return ret;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +EXPORT_SYMBOL_GPL(vsp1_isp_start_streaming);
> > > +
> > > +/**
> > > + * vsp1_isp_stop_streaming - Stop the VSPX
> > > + *
> > > + * @dev: The VSP1 struct device
> > > + */
> > > +void vsp1_isp_stop_streaming(struct device *dev)
> > > +{
> > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > +	unsigned long flags;
> > > +
> > > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> > > +
> > > +	if (!vspx_pipe->enabled) {
> > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > +		return;
> > > +	}
> >
> > 	scoped_guard(spinlock_irqsave)(&vspx_pipe->vspx_lock, flags) {
> > 		if (!vspx_pipe->enabled)
> > 			return;
> >
> > 		...
> >
> > (or spinlock_irq, see comment in vsp1_isp_start_streaming()).
> 
> Am I mistaken that the _irq version you suggest still disables
> local interrupts, which is something I shouldn't need
> 
> -------------------------------------------------------------------------------
> From "Documentation/kernel-hacking/locking.rst"
> 
> If a hardware irq handler shares data with a softirq, you have two
> concerns. Firstly, the softirq processing can be interrupted by a
> hardware interrupt, and secondly, the critical region could be entered
> by a hardware interrupt on another CPU. This is where
> spin_lock_irq() is used. It is defined to disable
> interrupts on that cpu, then grab the lock.
> spin_unlock_irq() does the reverse.
> -------------------------------------------------------------------------------
> 
> I guess I should simply use spin_lock/spin_unlock

See above.

> > > +
> > > +	vspx_pipe->enabled = false;
> > > +
> > > +	pipe->state = VSP1_PIPELINE_STOPPED;
> > > +	vsp1_pipeline_stop(pipe);
> >
> > You can't call this with a spin lock held. We can look together at how
> > to handle locking in this driver, and use a mutex instead of a spinlock
> > (I know you went the other way around).
> 
> Actually my only concern with mutexes is that they can't be called by
> the ISP driver while it holds a spinlock. With this new version the
> only function that would need to be called with a spinlock taken is
> vsp1_isp_job_run() and vsp1_isp_jobs_release() which only contends the
> jobs_queue with the rest of the functions of the VSPX driver.
> 
> I could use a mutex as a main lock here and a spinlock to protect the
> jobs queue.

I think you'll need to do that, using a mutex to protect the whole
start/stop functions, and a spinlock inside for specific fields of the
vsp1_vspx_pipeline structure that are also accessed in the interrupt
handler. Looking at v10, there's a race condition between start and stop
due to not covering the whole functions with a mutex.

> Or I could simply call the vsp1_pipeline_stop() outside of the
> critical section here.
> 
> > > +	vspx_pipe->vspx_frame_end = NULL;
> > > +	vsp1_dlm_reset(pipe->output->dlm);
> > > +
> > > +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> >
> > 	}
> >
> > > +
> > > +	vsp1_device_put(vsp1);
> > > +}
> > > +EXPORT_SYMBOL_GPL(vsp1_isp_stop_streaming);
> > > +
> > > +/**
> > > + * vsp1_vspx_job_prepare - Prepare a new buffer transfer job
> > > + *
> > > + * Prepare a new buffer transfer job by populating a display list that will be
> > > + * later executed by a call to vsp1_isp_job_run().
> > > + *
> > > + * The newly created job is appended to the queue of pending jobs. All pending
> > > + * jobs must be released after stopping the streaming operations with a call
> > > + * to vsp1_isp_jobs_release().
> > > + *
> > > + * @dev: The VSP1 struct device
> > > + * @job: The job description
> > > + *
> > > + * Return: %0 on success or a negative error code on failure
> > > + */
> > > +int vsp1_isp_job_prepare(struct device *dev,
> > > +			 const struct vsp1_isp_job_desc *desc)
> > > +{
> > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > +	const struct v4l2_pix_format_mplane *pix_mp;
> > > +	struct vsp1_dl_list *second_dl = NULL;
> > > +	struct vsp1_vspx_job *job;
> > > +	struct vsp1_dl_body *dlb;
> > > +	struct vsp1_dl_list *dl;
> > > +	int ret;
> > > +
> > > +	/* Memory will be released when the job is consumed. */
> > > +	job = kmalloc(sizeof(*job), GFP_KERNEL);
> > > +	if (!job)
> > > +		return -ENOMEM;
> >
> > Hmmm... I think this can be improved. I would need to see how the ISP
> > driver uses this API, but I'm thinking the job data could be stored in
> > vsp1_isp_job_desc, without adding the job to a queue in this function.
> > The job would then be passed to the run function, which would queue it.
> > I think the resulting API would be cleaner. You will need an extra
> > function to delete/destroy/cleanup a job that hasn't been queued (the
> > counterpart of this function, essentially).
> 
> I'm not sure I got this to be honest.
> 
> A VSPX job is basically just a display list + a list handle.
> Are you suggesting add the display list to vsp1_isp_job_desc and keep
> a queue of jobs on the ISP side only ? Then pass the dl in to
> vsp1_isp_job_run() ? This would avoid maintaing a queue of jobs on the
> VSPX side at all most probably and save me the hassle to cleanup the
> jobs queue on the VSPX side.
> 
> I'll see how this could look.

The implementation in v10 looks nicer.

> > > +
> > > +	/*
> > > +	 * Transfer the buffers described in the job: an optional ConfigDMA
> > > +	 * parameters buffer and a RAW image.
> > > +	 */
> > > +
> > > +	job->dl = vsp1_dl_list_get(pipe->output->dlm);
> > > +	if (!job->dl) {
> > > +		ret = -ENOMEM;
> > > +		goto error_free_job;
> > > +	}
> > > +
> > > +	dl = job->dl;
> > > +	dlb = vsp1_dl_list_get_body0(dl);
> > > +
> > > +	/* Disable RPF1. */
> > > +	vsp1_dl_body_write(dlb, vsp1->rpf[1]->entity.route->reg,
> > > +			   VI6_DPR_NODE_UNUSED);
> >
> > The route is disabled in vsp1_device_init(), and we never use RPF1. I
> > think this can be dropped.
> 
> ack
> 
> > > +
> > > +	/* Configure IIF routing and enable IIF function */
> >
> > s/function/function./
> >
> > > +	vsp1_entity_route_setup(pipe->iif, pipe, dlb);
> > > +	vsp1_entity_configure_stream(pipe->iif, pipe->iif->state, pipe,
> > > +				     dl, dlb);
> > > +
> > > +	/* Configure WPF0 to enable RPF0 as source*/
> >
> > s/source/source. /
> >
> > > +	vsp1_entity_route_setup(&pipe->output->entity, pipe, dlb);
> > > +	vsp1_entity_configure_stream(&pipe->output->entity,
> > > +				     pipe->output->entity.state, pipe,
> > > +				     dl, dlb);
> >
> > I'm kind of tempting to call route_setup, configure_stream,
> > configure_frame and configure_partition by iterating over the entities
> > in the pipe, as done by vsp1_du_pipeline_configure(). The code would be
> > more generic, at the cost of a few calls being duplicated between the
> > config DMA and image configurations. What do you think ?
> 
> The fact that the RPF0 is optionally configurable for ConfigDMA
> 
> > > +
> > > +	if (desc->config.pairs) {
> 
> here
> 
> makes it hard for me to see how this could be done cleanly.
> 
> To be honest, the pipeline is fixed (rpf->iif->wpf) so I don't see
> much benefits here

Agreed.

> > > +		/*
> > > +		 * Configure RPF0 for config data. Transfer the number of
> > > +		 * configuration pairs plus 2 words for the header.
> > > +		 */
> > > +		ret = vsp1_vspx_rpf0_configure(vsp1, desc->config.mem,
> > > +					       V4L2_META_FMT_GENERIC_8,
> > > +					       desc->config.pairs * 2 + 2, 1,
> > > +					       desc->config.pairs * 2 + 2,
> > > +					       VSPX_IIF_SINK_PAD_CONFIG,
> > > +					       dl, dlb);
> > > +		if (ret)
> > > +			goto error_put_dl;
> > > +
> > > +		second_dl = vsp1_dl_list_get(pipe->output->dlm);
> > > +		if (!second_dl) {
> > > +			ret = -ENOMEM;
> > > +			goto error_put_dl;
> > > +		}
> > > +
> > > +		dl = second_dl;
> > > +		dlb = vsp1_dl_list_get_body0(dl);
> > > +	}
> > > +
> > > +	/* Configure RPF0 for RAW image transfer. */
> > > +	pix_mp = &desc->img.fmt.fmt.pix_mp;
> > > +	ret = vsp1_vspx_rpf0_configure(vsp1, desc->img.mem,
> > > +				       pix_mp->pixelformat,
> > > +				       pix_mp->width, pix_mp->height,
> > > +				       pix_mp->plane_fmt[0].bytesperline,
> > > +				       VSPX_IIF_SINK_PAD_IMG, dl, dlb);
> > > +	if (ret)
> > > +		goto error_put_dl;
> > > +
> > > +	if (second_dl)
> > > +		vsp1_dl_list_add_chain(job->dl, second_dl);
> > > +
> > > +	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
> > > +		list_add_tail(&job->job_queue, &vspx_pipe->jobs);
> > > +	}
> > > +
> > > +	return 0;
> > > +
> > > +error_put_dl:
> > > +	if (second_dl)
> > > +		vsp1_dl_list_put(second_dl);
> > > +	vsp1_dl_list_put(job->dl);
> > > +error_free_job:
> > > +	kfree(job);
> > > +	return ret;
> > > +}
> > > +EXPORT_SYMBOL_GPL(vsp1_isp_job_prepare);

[snip]

-- 
Regards,

Laurent Pinchart
Re: [PATCH v8] media: vsp1: Add VSPX support
Posted by Jacopo Mondi 7 months, 3 weeks ago
Hi Laurent

On Sat, Jun 14, 2025 at 05:55:40PM +0300, Laurent Pinchart wrote:
> Hi Jacopo,
>
> I realize I haven't replied to this e-mail. Sorry about that. I've added
> a few comments below related to issues that I think are still relevant
> to v10.
>
> On Mon, May 05, 2025 at 10:19:57AM +0200, Jacopo Mondi wrote:
> > On Fri, May 02, 2025 at 11:26:44PM +0300, Laurent Pinchart wrote:
> > > On Fri, May 02, 2025 at 03:23:10PM +0200, Jacopo Mondi wrote:
> > > > Add support for VSPX, a specialized version of the VSP2 that
> > > > transfers data to the ISP. The VSPX is composed of two RPF units
> > > > to read data from external memory and an IIF instance that performs
> > > > transfer towards the ISP.
> > > >
> > > > The VSPX is supported through a newly introduced vsp1_vspx.c file that
> > > > exposes two interfaces: vsp1_vspx interface, declared in vsp1_vspx.h
> > > > for the vsp1 core to initialize and cleanup the VSPX, and a vsp1_isp
> > > > interface, declared in include/media/vsp1.h for the ISP driver to
> > > > control the VSPX operations.
> > > >
> > > > Signed-off-by: Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
> > > > ---
> > > > The VSPX is a VSP2 function that reads data from external memory using
> > > > two RPF instances and feed it to the ISP.
> > > >
> > > > The VSPX includes an IIF unit (ISP InterFace) modeled in the vsp1 driver
> > > > as a new, simple, entity type.
> > > >
> > > > IIF is part of VSPX, a version of the VSP2 IP specialized for ISP
> > > > interfacing. To prepare to support VSPX, support IIF first by
> > > > introducing a new entity and by adjusting the RPF/WPF drivers to
> > > > operate correctly when an IIF is present.
> > > >
> > > > Changes in v8:
> > > > - Remove patches already collected by Laurent in
> > > >   [GIT PULL FOR v6.16] Renesas media drivers changes
> > > >
> > > > - Rebased on
> > > >   https://gitlab.freedesktop.org/linux-media/users/pinchartl.git #renesas-next
> > > >
> > > > - Changes to the VSPX interface towards the ISP
> > > >   - Split start/stop_streaming
> > > >   - Add vsp1_isp_jobs_release() to release pending jobs
> > > >   - Add vsp1_isp_free_buffer()
> > > >   - Remove vsp1_isp_configure() and compute partitions on job creation
> > > >
> > > > - Driver changes
> > > >   - Drop irq-driver flow
> > > >     The VSPX used to schedule new jobs as soon as processing the last
> > > >     one is done. This doesn't work well with the R-Car ISP design
> > > >     for two reasons:
> > > >     - The ISP needs per-job registers programming
> > > >     - The ISP and VSPX job queues have to stay in sync
> > > >
> > > > - Minors
> > > >   - Remove the jobs_lock as a single lock is fine
> > > >   - Protect against VSPX/ISP irq races in job_run() by checking the
> > > >     VSPX 'busy' register and remove the 'processing' flag
> > > >   - Manually set the pipeline state to STOPPED before scheduling a new
> > > >     job without waiting for frame_end
> > > >
> > > > Changes in v7:
> > > > - Include VSPX driver in the series
> > > > - Use existing VSP1 formats and remove patches extending formats on RPF
> > > > - Rework VSPX driver to split jobs creation and scheduling in two
> > > >   different API entry points
> > > > - Fix VSPX stride using the user provided bytesperline and using the
> > > >   buffer size for ConfigDMA buffers
> > > > - Link to v6: https://lore.kernel.org/r/20250321-v4h-iif-v6-0-361e9043026a@ideasonboard.com
> > > >
> > > > Changes in v6:
> > > > - Little cosmetic change as suggested by Laurent
> > > > - Collect tags
> > > > - Link to v5: https://lore.kernel.org/r/20250319-v4h-iif-v5-0-0a10456d792c@ideasonboard.com
> > > >
> > > > Changes in v5:
> > > > - Drop additional empty line 5/6
> > > > - Link to v4: https://lore.kernel.org/r/20250318-v4h-iif-v4-0-10ed4c41c195@ideasonboard.com
> > > >
> > > > Changes in v4:
> > > > - Fix SWAP bits for RAW10, RAW12 and RAW16
> > > > - Link to v3: https://lore.kernel.org/r/20250317-v4h-iif-v3-0-63aab8982b50@ideasonboard.com
> > > >
> > > > Changes in v3:
> > > > - Drop 2/6 from v2
> > > > - Add 5/7 to prepare for a new implementation of 6/7
> > > > - Individual changelog per patch
> > > > - Add 7/7
> > > > - Link to v2: https://lore.kernel.org/r/20250224-v4h-iif-v2-0-0305e3c1fe2d@ideasonboard.com
> > > >
> > > > Changes in v2:
> > > > - Collect tags
> > > > - Address review comments from Laurent, a lot of tiny changes here and
> > > >   there but no major redesign worth an entry in the patchset changelog
> > >
> > > You've come a long way since v1, I think we're getting close to a good
> > > implementation.
> >
> > Thank you and Niklas for reviews and testing
> >
> > > > ---
> > > >  drivers/media/platform/renesas/vsp1/Makefile    |   1 +
> > > >  drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
> > > >  drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
> > > >  drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
> > > >  drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
> > > >  drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
> > > >  include/media/vsp1.h                            |  75 +++
> > > >  7 files changed, 770 insertions(+), 1 deletion(-)
>
> [snip]
>
> > > > diff --git a/drivers/media/platform/renesas/vsp1/vsp1_vspx.c b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > > > new file mode 100644
> > > > index 000000000000..6edb5e4929d9
> > > > --- /dev/null
> > > > +++ b/drivers/media/platform/renesas/vsp1/vsp1_vspx.c
> > > > @@ -0,0 +1,664 @@
>
> [snip]
>
> > > > +static int vsp1_vspx_rpf0_configure(struct vsp1_device *vsp1,
> > > > +				    dma_addr_t addr, u32 isp_fourcc,
> > > > +				    unsigned int width, unsigned int height,
> > > > +				    unsigned int stride,
> > > > +				    unsigned int iif_sink_pad,
> > > > +				    struct vsp1_dl_list *dl,
> > > > +				    struct vsp1_dl_body *dlb)
> > > > +{
> > > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > > +	struct vsp1_rwpf *rpf0 = pipe->inputs[0];
> > > > +	int ret;
> > > > +
> > > > +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, rpf0, isp_fourcc, width,
> > > > +					    height);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	ret = vsp1_vspx_rwpf_set_subdev_fmt(vsp1, pipe->output, isp_fourcc,
> > > > +					    width, height);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	vsp1_pipeline_calculate_partition(pipe, &pipe->part_table[0],
> > > > +					  width, 0);
> > > > +
> > >
> > > You should also set rwpf->format.num_planes to 1 here.
> >
> > ack
> >
> > > > +	rpf0->format.plane_fmt[0].bytesperline = stride;
> > > > +
> > > > +	/*
> > > > +	 * Connect RPF0 to the IIF sink pad corresponding to the config or image
> > > > +	 * path.
> > > > +	 */
> > > > +	rpf0->entity.sink_pad = iif_sink_pad;
> > > > +
> > > > +	pipe->part_table[0].rpf[0].width = width;
> > > > +	pipe->part_table[0].rpf[0].height = height;
> > >
> > > Isn't this handled by vsp1_pipeline_calculate_partition() ?
> >
> > I don't see it happening in the vsp1_pipeline_calculate_partition()
> > call chain...
> >
> > vsp1_pipeline_calculate_partition() calls
> > vsp1_pipeline_propagate_partition() which calls the 'partition' op on
> > entities in the pipeline.
> >
> > The RPF implementation of the 'partition' op however initializes the
> > crop retangles on the entity, but not the part_table[].
>
> The RPF .partition() handler is implemented as
>
> static void rpf_partition(struct vsp1_entity *entity,
> 			  struct v4l2_subdev_state *state,
> 			  struct vsp1_pipeline *pipe,
> 			  struct vsp1_partition *partition,
> 			  unsigned int partition_idx,
> 			  struct v4l2_rect *window)
> {
> 	struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
> 	struct v4l2_rect *rpf_rect = &partition->rpf[rpf->entity.index];
>
> 	/*
> 	 * Partition Algorithm Control
> 	 *
> 	 * The partition algorithm can split this frame into multiple slices. We
> 	 * must adjust our partition window based on the pipe configuration to
> 	 * match the destination partition window. To achieve this, we adjust
> 	 * our crop to provide a 'sub-crop' matching the expected partition
> 	 * window.
> 	 */
> 	*rpf_rect = *v4l2_subdev_state_get_crop(state, RWPF_PAD_SINK);
>
> 	if (pipe->partitions > 1) {
> 		rpf_rect->width = window->width;
> 		rpf_rect->left += window->left;
> 	}
> }
>
> The partition argument to the function is &pipe->part_table[0], so
> rpf_rect is pipe->part_table[0].rpf[0].
>
> Could you check if the values of pipe->part_table[0].rpf[0].width and
> pipe->part_table[0].rpf[0].height match width and height after you call
> vsp1_pipeline_calculate_partition() ?

Yes they do

I will drop the assignment of width and height

>
> > To be honest that partition part is still a bit obscure to me, so I
> > might be missing something indeed
> >
> > > > +
> > > > +	rpf0->mem.addr[0] = addr;
> > > > +	rpf0->mem.addr[1] = 0;
> > > > +	rpf0->mem.addr[2] = 0;
> > > > +
> > > > +	vsp1_entity_route_setup(&rpf0->entity, pipe, dlb);
> > > > +	vsp1_entity_configure_stream(&rpf0->entity, rpf0->entity.state, pipe,
> > > > +				     dl, dlb);
> > > > +	vsp1_entity_configure_partition(&rpf0->entity, pipe,
> > > > +					&pipe->part_table[0], dl, dlb);
> > > > +
> > > > +	return 0;
> > > > +}
>
> [snip]
>
> > > > +/**
> > > > + * vsp1_isp_start_streaming - Start processing VSPX jobs
> > > > + *
> > > > + * Start the VSPX and prepare for accepting buffer transfer job requests.
> > > > + *
> > > > + * @dev: The VSP1 struct device
> > > > + * @frame_end: The frame end callback description
> > > > + *
> > > > + * Return: %0 on success or a negative error code on failure
> > > > + */
> > > > +int vsp1_isp_start_streaming(struct device *dev,
> > > > +			     struct vsp1_vspx_frame_end *frame_end)
> > > > +{
> > > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > > +	unsigned long flags;
> > > > +	u32 value;
> > > > +	int ret;
> > > > +
> > > > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> > >
> > > Can this function ever be called with interrupts disabled ? If no, you
> > > can use spin_lock_irq().
> >
> > I think the question here should rather be: do we need to disable
> > local interrupts at all when calling this function ? As the VSPX
> > workflow is now driven by ISP and there are no contentions between any
> > of the driver functions and the VSPX interrupt handler I guess I can
> > use spin_lock() and spin_unlock() everywhere and replace the
> > irqsave/irqrestore versions ?
>
> It depends what the spinlock protects.
>
> The point of the _irq and _irqsave variants is to prevent deadlocks. If
> a code section protected by a spinlock is interrupted by an IRQ handler
> on the same CPU, and the IRQ handler tries to take the same lock, a
> deadlock will occur: the IRQ handler will spin on the lock to wait until
> it gets released, but the lock will never get released as the code that
> took it has been preempted by the IRQ handler. Note that such an IRQ
> handler running on a different CPU is not an issue, it will spin on the
> lock, but the lock will be released by the first CPU running in parallel
> to the IRQ handler.
>
> To solve that problem, the spin_lock_irq() and spin_unlock_irq()
> functions respectively disable and enable local interrupts (i.e.
> interrupts on the same CPU, not interrupts on other CPUs). They should
> be called in contexts where interrupts are enabled (so not in an IRQ
> handler, and not in a section covered by another spin_lock_irq() call or
> uby other calls that disable local interrupts), and will ensure that no
> IRQ handler can preempt the CPU.
>
> Sometimes, you don't know if your code is running in a context where
> interrupts are enabled or not. This is the case for instance in helper
> functions that can be used in either contects. Using spin_lock_irq() is
> safe there (disabling local interrupts when they are already disabled is
> a no-op), but spin_unlock_irq() will enable local interrupts, which
> would lead to problems. For these cases, spin_lock_irqsave() will save
> the state of local interrupts and disable them, and
> spin_unlock_irqrestore() will restore the local interrupts state to what
> it was before spin_lock_irqsave().
>
> It is safe to use spin_lock_irqsave() and spin_unlock_irqrestore()
> unconditionally, but it has a performance impact. Using the right
> variants isn't very difficult. Once you identify the data protected by
> the lock, you need to look at where the data is accessed.
>
> - If the data is accessed only in non-interrupt context, or only in
>   interrupt context, use spin_lock() and spin_unlock() everywhere.
>
> - If the data is accessed in both contexts, use
>
>   - spin_lock() and spin_unlock() in code that runs only in contexts
>     where local interrupts are disabled
>
>   - spin_lock_irq() and spin_unlock_irq() in code that runs only in
>     contexts where local interrupts are enabled
>
>   - spin_lock_irqsave() and spin_unlock_irqrestore() in code that can
>     run in either context
>
> The vspx_lock is documented as "protect access to the VSPX configuration
> and the jobs queue". That's quite vague, so the first thing you'll have
> to do is to list the exact pieces of data that the lock protects, and
> then decide on the locking scheme.
>

With the new design I think I can drop the _irq/_irqsave version from
most places but the VSP1 interrupt handler callback. I will have a
look

> > > > +
> > > > +	if (vspx_pipe->enabled) {
> > > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > > +		return 0;
> > >
> > > Shouldn't this be an error ?
> >
> > Should it ? I don't think it's an error conceptually, as the ISP
> > driver is currently designed however a single call to this function
> > should happen, so I can flag this as an error.
> >
> > However, as S_STREAM(1) is called on every video node, I would not
> > rule out a design where each video dev tries to start the VSPX. The
> > ones that arrive late will just hit a nop.
>
> The API exposed by the vsp driver to the ISP driver needs to be designed
> and clearly documented. There are different design options, but the API
> should not rely on a particular call pattern from the ISP driver without
> documenting it. As the documentation above doesn't indicate what the
> call pattern is, I can't tell if the design is good or not.
>
> This being said, if there are multiple userspace-facing devices that all
> need to be started, someone will have to reference-count the VSP
> start/stop state. Doing so in the vsp driver seems to add complexity
> without any real benefit. I think that handling this in the ISP driver
> would be better. I recommend making calling vsp1_isp_start_streaming()
> on an already enabled VSP instance invalid, and returning an error. Same
> when stopping the VSP, the ISP driver should decide when the VSP should
> be stopped, and should not call vsp1_isp_stop_streaming() on an already
> stopped instance.
>

I'll make this an error and specify that the function shall be called
once only in the documentation.

> > > > +	}
> > > > +
> > > > +	if (!frame_end) {
> > >
> > > You can move this check before taking the lock.
> >
> > True that
> >
> > > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	vspx_pipe->vspx_frame_end = frame_end->vspx_frame_end;
> > > > +	vspx_pipe->frame_end_data = frame_end->frame_end_data;
> > >
> > > Move this just before setting ->enabled to true, so you won't override
> > > the values if the function returns an error in the checks below.
> > >
> > > > +
> > > > +	/* Make sure VSPX is not active. */
> > >
> > > This should never happen unless there's a serious bug in the driver,
> > > right ? I would make that explicit in the comment:
> > >
> > > 	/*
> > > 	 * Make sure VSPX is not active. This should never happen in normal
> > > 	 * usage.
> > > 	 */
> >
> > ok
> >
> > > > +	value = vsp1_read(vsp1, VI6_CMD(0));
> > > > +	if (value & VI6_CMD_STRCMD) {
> > > > +		dev_err(vsp1->dev,
> > > > +			"%s: Starting of WPF0 already reserved\n", __func__);
> > > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > > +		return -EBUSY;
> > > > +	}
> > > > +
> > > > +	value = vsp1_read(vsp1, VI6_STATUS);
> > > > +	if (value & VI6_STATUS_SYS_ACT(0)) {
> > > > +		dev_err(vsp1->dev,
> > > > +			"%s: WPF0 has not entered idle state\n", __func__);
> > > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > > +		return -EBUSY;
> > > > +	}
> > > > +
> > > > +	vspx_pipe->enabled = true;
> > > > +
> > > > +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > >
> > > and close the guarded scope here
> > >
> > > 	}
> >
> > I found it to be less readable because of the increased scoping.
> >
> > > > +
> > > > +	/* Enable the VSP1 and prepare for streaming. */
> > > > +	vsp1_pipeline_dump(pipe, "VSPX job");
> > > > +
> > > > +	ret = vsp1_device_get(vsp1);
> > >
> > > This should be done before you read the registers.
> >
> > Ah right. I guess this function should be then re-designed.
> >
> > > > +	if (ret < 0) {
> > > > +		guard(spinlock_irqsave)(&vspx_pipe->vspx_lock);
> > > > +		vspx_pipe->enabled = false;
> > > > +		return ret;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(vsp1_isp_start_streaming);
> > > > +
> > > > +/**
> > > > + * vsp1_isp_stop_streaming - Stop the VSPX
> > > > + *
> > > > + * @dev: The VSP1 struct device
> > > > + */
> > > > +void vsp1_isp_stop_streaming(struct device *dev)
> > > > +{
> > > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > > +	unsigned long flags;
> > > > +
> > > > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> > > > +
> > > > +	if (!vspx_pipe->enabled) {
> > > > +		spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > > > +		return;
> > > > +	}
> > >
> > > 	scoped_guard(spinlock_irqsave)(&vspx_pipe->vspx_lock, flags) {
> > > 		if (!vspx_pipe->enabled)
> > > 			return;
> > >
> > > 		...
> > >
> > > (or spinlock_irq, see comment in vsp1_isp_start_streaming()).
> >
> > Am I mistaken that the _irq version you suggest still disables
> > local interrupts, which is something I shouldn't need
> >
> > -------------------------------------------------------------------------------
> > From "Documentation/kernel-hacking/locking.rst"
> >
> > If a hardware irq handler shares data with a softirq, you have two
> > concerns. Firstly, the softirq processing can be interrupted by a
> > hardware interrupt, and secondly, the critical region could be entered
> > by a hardware interrupt on another CPU. This is where
> > spin_lock_irq() is used. It is defined to disable
> > interrupts on that cpu, then grab the lock.
> > spin_unlock_irq() does the reverse.
> > -------------------------------------------------------------------------------
> >
> > I guess I should simply use spin_lock/spin_unlock
>
> See above.
>
> > > > +
> > > > +	vspx_pipe->enabled = false;
> > > > +
> > > > +	pipe->state = VSP1_PIPELINE_STOPPED;
> > > > +	vsp1_pipeline_stop(pipe);
> > >
> > > You can't call this with a spin lock held. We can look together at how
> > > to handle locking in this driver, and use a mutex instead of a spinlock
> > > (I know you went the other way around).
> >
> > Actually my only concern with mutexes is that they can't be called by
> > the ISP driver while it holds a spinlock. With this new version the
> > only function that would need to be called with a spinlock taken is
> > vsp1_isp_job_run() and vsp1_isp_jobs_release() which only contends the
> > jobs_queue with the rest of the functions of the VSPX driver.
> >
> > I could use a mutex as a main lock here and a spinlock to protect the
> > jobs queue.
>
> I think you'll need to do that, using a mutex to protect the whole
> start/stop functions, and a spinlock inside for specific fields of the
> vsp1_vspx_pipeline structure that are also accessed in the interrupt
> handler. Looking at v10, there's a race condition between start and stop
> due to not covering the whole functions with a mutex.

I'll need a mutex to protect the whole start/stop streaming paths yes.

However I still need to take a spinlock in the stop_streaming path as
there's a contention on vspx_pipe->enabled between stop_streaming and
job run.

>
> > Or I could simply call the vsp1_pipeline_stop() outside of the
> > critical section here.
> >
> > > > +	vspx_pipe->vspx_frame_end = NULL;
> > > > +	vsp1_dlm_reset(pipe->output->dlm);
> > > > +
> > > > +	spin_unlock_irqrestore(&vspx_pipe->vspx_lock, flags);
> > >
> > > 	}
> > >
> > > > +
> > > > +	vsp1_device_put(vsp1);
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(vsp1_isp_stop_streaming);
> > > > +
> > > > +/**
> > > > + * vsp1_vspx_job_prepare - Prepare a new buffer transfer job
> > > > + *
> > > > + * Prepare a new buffer transfer job by populating a display list that will be
> > > > + * later executed by a call to vsp1_isp_job_run().
> > > > + *
> > > > + * The newly created job is appended to the queue of pending jobs. All pending
> > > > + * jobs must be released after stopping the streaming operations with a call
> > > > + * to vsp1_isp_jobs_release().
> > > > + *
> > > > + * @dev: The VSP1 struct device
> > > > + * @job: The job description
> > > > + *
> > > > + * Return: %0 on success or a negative error code on failure
> > > > + */
> > > > +int vsp1_isp_job_prepare(struct device *dev,
> > > > +			 const struct vsp1_isp_job_desc *desc)
> > > > +{
> > > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > > +	const struct v4l2_pix_format_mplane *pix_mp;
> > > > +	struct vsp1_dl_list *second_dl = NULL;
> > > > +	struct vsp1_vspx_job *job;
> > > > +	struct vsp1_dl_body *dlb;
> > > > +	struct vsp1_dl_list *dl;
> > > > +	int ret;
> > > > +
> > > > +	/* Memory will be released when the job is consumed. */
> > > > +	job = kmalloc(sizeof(*job), GFP_KERNEL);
> > > > +	if (!job)
> > > > +		return -ENOMEM;
> > >
> > > Hmmm... I think this can be improved. I would need to see how the ISP
> > > driver uses this API, but I'm thinking the job data could be stored in
> > > vsp1_isp_job_desc, without adding the job to a queue in this function.
> > > The job would then be passed to the run function, which would queue it.
> > > I think the resulting API would be cleaner. You will need an extra
> > > function to delete/destroy/cleanup a job that hasn't been queued (the
> > > counterpart of this function, essentially).
> >
> > I'm not sure I got this to be honest.
> >
> > A VSPX job is basically just a display list + a list handle.
> > Are you suggesting add the display list to vsp1_isp_job_desc and keep
> > a queue of jobs on the ISP side only ? Then pass the dl in to
> > vsp1_isp_job_run() ? This would avoid maintaing a queue of jobs on the
> > VSPX side at all most probably and save me the hassle to cleanup the
> > jobs queue on the VSPX side.
> >
> > I'll see how this could look.
>
> The implementation in v10 looks nicer.
>
> > > > +
> > > > +	/*
> > > > +	 * Transfer the buffers described in the job: an optional ConfigDMA
> > > > +	 * parameters buffer and a RAW image.
> > > > +	 */
> > > > +
> > > > +	job->dl = vsp1_dl_list_get(pipe->output->dlm);
> > > > +	if (!job->dl) {
> > > > +		ret = -ENOMEM;
> > > > +		goto error_free_job;
> > > > +	}
> > > > +
> > > > +	dl = job->dl;
> > > > +	dlb = vsp1_dl_list_get_body0(dl);
> > > > +
> > > > +	/* Disable RPF1. */
> > > > +	vsp1_dl_body_write(dlb, vsp1->rpf[1]->entity.route->reg,
> > > > +			   VI6_DPR_NODE_UNUSED);
> > >
> > > The route is disabled in vsp1_device_init(), and we never use RPF1. I
> > > think this can be dropped.
> >
> > ack
> >
> > > > +
> > > > +	/* Configure IIF routing and enable IIF function */
> > >
> > > s/function/function./
> > >
> > > > +	vsp1_entity_route_setup(pipe->iif, pipe, dlb);
> > > > +	vsp1_entity_configure_stream(pipe->iif, pipe->iif->state, pipe,
> > > > +				     dl, dlb);
> > > > +
> > > > +	/* Configure WPF0 to enable RPF0 as source*/
> > >
> > > s/source/source. /
> > >
> > > > +	vsp1_entity_route_setup(&pipe->output->entity, pipe, dlb);
> > > > +	vsp1_entity_configure_stream(&pipe->output->entity,
> > > > +				     pipe->output->entity.state, pipe,
> > > > +				     dl, dlb);
> > >
> > > I'm kind of tempting to call route_setup, configure_stream,
> > > configure_frame and configure_partition by iterating over the entities
> > > in the pipe, as done by vsp1_du_pipeline_configure(). The code would be
> > > more generic, at the cost of a few calls being duplicated between the
> > > config DMA and image configurations. What do you think ?
> >
> > The fact that the RPF0 is optionally configurable for ConfigDMA
> >
> > > > +
> > > > +	if (desc->config.pairs) {
> >
> > here
> >
> > makes it hard for me to see how this could be done cleanly.
> >
> > To be honest, the pipeline is fixed (rpf->iif->wpf) so I don't see
> > much benefits here
>
> Agreed.
>
> > > > +		/*
> > > > +		 * Configure RPF0 for config data. Transfer the number of
> > > > +		 * configuration pairs plus 2 words for the header.
> > > > +		 */
> > > > +		ret = vsp1_vspx_rpf0_configure(vsp1, desc->config.mem,
> > > > +					       V4L2_META_FMT_GENERIC_8,
> > > > +					       desc->config.pairs * 2 + 2, 1,
> > > > +					       desc->config.pairs * 2 + 2,
> > > > +					       VSPX_IIF_SINK_PAD_CONFIG,
> > > > +					       dl, dlb);
> > > > +		if (ret)
> > > > +			goto error_put_dl;
> > > > +
> > > > +		second_dl = vsp1_dl_list_get(pipe->output->dlm);
> > > > +		if (!second_dl) {
> > > > +			ret = -ENOMEM;
> > > > +			goto error_put_dl;
> > > > +		}
> > > > +
> > > > +		dl = second_dl;
> > > > +		dlb = vsp1_dl_list_get_body0(dl);
> > > > +	}
> > > > +
> > > > +	/* Configure RPF0 for RAW image transfer. */
> > > > +	pix_mp = &desc->img.fmt.fmt.pix_mp;
> > > > +	ret = vsp1_vspx_rpf0_configure(vsp1, desc->img.mem,
> > > > +				       pix_mp->pixelformat,
> > > > +				       pix_mp->width, pix_mp->height,
> > > > +				       pix_mp->plane_fmt[0].bytesperline,
> > > > +				       VSPX_IIF_SINK_PAD_IMG, dl, dlb);
> > > > +	if (ret)
> > > > +		goto error_put_dl;
> > > > +
> > > > +	if (second_dl)
> > > > +		vsp1_dl_list_add_chain(job->dl, second_dl);
> > > > +
> > > > +	scoped_guard(spinlock_irqsave, &vspx_pipe->vspx_lock) {
> > > > +		list_add_tail(&job->job_queue, &vspx_pipe->jobs);
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +
> > > > +error_put_dl:
> > > > +	if (second_dl)
> > > > +		vsp1_dl_list_put(second_dl);
> > > > +	vsp1_dl_list_put(job->dl);
> > > > +error_free_job:
> > > > +	kfree(job);
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(vsp1_isp_job_prepare);
>
> [snip]
>
> --
> Regards,
>
> Laurent Pinchart
Re: [PATCH v8] media: vsp1: Add VSPX support
Posted by Jacopo Mondi 9 months, 1 week ago
On Mon, May 05, 2025 at 10:19:57AM +0200, Jacopo Mondi wrote:
> Hi Laurent
>
> On Fri, May 02, 2025 at 11:26:44PM +0300, Laurent Pinchart wrote:
> > Hi Jacopo,
> >
> > Thank you for the patch.
> >
> > On Fri, May 02, 2025 at 03:23:10PM +0200, Jacopo Mondi wrote:
> > > Add support for VSPX, a specialized version of the VSP2 that
> > > transfers data to the ISP. The VSPX is composed of two RPF units
> > > to read data from external memory and an IIF instance that performs
> > > transfer towards the ISP.
> > >
> > > The VSPX is supported through a newly introduced vsp1_vspx.c file that
> > > exposes two interfaces: vsp1_vspx interface, declared in vsp1_vspx.h
> > > for the vsp1 core to initialize and cleanup the VSPX, and a vsp1_isp
> > > interface, declared in include/media/vsp1.h for the ISP driver to
> > > control the VSPX operations.
> > >
> > > Signed-off-by: Jacopo Mondi <jacopo.mondi+renesas@ideasonboard.com>
> > > ---
> > > The VSPX is a VSP2 function that reads data from external memory using
> > > two RPF instances and feed it to the ISP.
> > >
> > > The VSPX includes an IIF unit (ISP InterFace) modeled in the vsp1 driver
> > > as a new, simple, entity type.
> > >
> > > IIF is part of VSPX, a version of the VSP2 IP specialized for ISP
> > > interfacing. To prepare to support VSPX, support IIF first by
> > > introducing a new entity and by adjusting the RPF/WPF drivers to
> > > operate correctly when an IIF is present.
> > >
> > > Changes in v8:
> > > - Remove patches already collected by Laurent in
> > >   [GIT PULL FOR v6.16] Renesas media drivers changes
> > >
> > > - Rebased on
> > >   https://gitlab.freedesktop.org/linux-media/users/pinchartl.git #renesas-next
> > >
> > > - Changes to the VSPX interface towards the ISP
> > >   - Split start/stop_streaming
> > >   - Add vsp1_isp_jobs_release() to release pending jobs
> > >   - Add vsp1_isp_free_buffer()
> > >   - Remove vsp1_isp_configure() and compute partitions on job creation
> > >
> > > - Driver changes
> > >   - Drop irq-driver flow
> > >     The VSPX used to schedule new jobs as soon as processing the last
> > >     one is done. This doesn't work well with the R-Car ISP design
> > >     for two reasons:
> > >     - The ISP needs per-job registers programming
> > >     - The ISP and VSPX job queues have to stay in sync
> > >
> > > - Minors
> > >   - Remove the jobs_lock as a single lock is fine
> > >   - Protect against VSPX/ISP irq races in job_run() by checking the
> > >     VSPX 'busy' register and remove the 'processing' flag
> > >   - Manually set the pipeline state to STOPPED before scheduling a new
> > >     job without waiting for frame_end
> > >
> > > Changes in v7:
> > > - Include VSPX driver in the series
> > > - Use existing VSP1 formats and remove patches extending formats on RPF
> > > - Rework VSPX driver to split jobs creation and scheduling in two
> > >   different API entry points
> > > - Fix VSPX stride using the user provided bytesperline and using the
> > >   buffer size for ConfigDMA buffers
> > > - Link to v6: https://lore.kernel.org/r/20250321-v4h-iif-v6-0-361e9043026a@ideasonboard.com
> > >
> > > Changes in v6:
> > > - Little cosmetic change as suggested by Laurent
> > > - Collect tags
> > > - Link to v5: https://lore.kernel.org/r/20250319-v4h-iif-v5-0-0a10456d792c@ideasonboard.com
> > >
> > > Changes in v5:
> > > - Drop additional empty line 5/6
> > > - Link to v4: https://lore.kernel.org/r/20250318-v4h-iif-v4-0-10ed4c41c195@ideasonboard.com
> > >
> > > Changes in v4:
> > > - Fix SWAP bits for RAW10, RAW12 and RAW16
> > > - Link to v3: https://lore.kernel.org/r/20250317-v4h-iif-v3-0-63aab8982b50@ideasonboard.com
> > >
> > > Changes in v3:
> > > - Drop 2/6 from v2
> > > - Add 5/7 to prepare for a new implementation of 6/7
> > > - Individual changelog per patch
> > > - Add 7/7
> > > - Link to v2: https://lore.kernel.org/r/20250224-v4h-iif-v2-0-0305e3c1fe2d@ideasonboard.com
> > >
> > > Changes in v2:
> > > - Collect tags
> > > - Address review comments from Laurent, a lot of tiny changes here and
> > >   there but no major redesign worth an entry in the patchset changelog
> >
> > You've come a long way since v1, I think we're getting close to a good
> > implementation.
> >
>
> Thank you and Niklas for reviews and testing
>
> > > ---
> > >  drivers/media/platform/renesas/vsp1/Makefile    |   1 +
> > >  drivers/media/platform/renesas/vsp1/vsp1.h      |   1 +
> > >  drivers/media/platform/renesas/vsp1/vsp1_drv.c  |  13 +-
> > >  drivers/media/platform/renesas/vsp1/vsp1_regs.h |   1 +
> > >  drivers/media/platform/renesas/vsp1/vsp1_vspx.c | 664 ++++++++++++++++++++++++
> > >  drivers/media/platform/renesas/vsp1/vsp1_vspx.h |  16 +
> > >  include/media/vsp1.h                            |  75 +++
> > >  7 files changed, 770 insertions(+), 1 deletion(-)
> > >

[snip]

> > > +
> > > +/**
> > > + * vsp1_isp_start_streaming - Start processing VSPX jobs
> > > + *
> > > + * Start the VSPX and prepare for accepting buffer transfer job requests.
> > > + *
> > > + * @dev: The VSP1 struct device
> > > + * @frame_end: The frame end callback description
> > > + *
> > > + * Return: %0 on success or a negative error code on failure
> > > + */
> > > +int vsp1_isp_start_streaming(struct device *dev,
> > > +			     struct vsp1_vspx_frame_end *frame_end)
> > > +{
> > > +	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
> > > +	struct vsp1_vspx_pipeline *vspx_pipe = &vsp1->vspx->pipe;
> > > +	struct vsp1_pipeline *pipe = &vspx_pipe->pipe;
> > > +	unsigned long flags;
> > > +	u32 value;
> > > +	int ret;
> > > +
> > > +	spin_lock_irqsave(&vspx_pipe->vspx_lock, flags);
> >
> > Can this function ever be called with interrupts disabled ? If no, you
> > can use spin_lock_irq().
> >
>
> I think the question here should rather be: do we need to disable
> local interrupts at all when calling this function ? As the VSPX
> workflow is now driven by ISP and there are no contentions between any
> of the driver functions and the VSPX interrupt handler I guess I can
> use spin_lock() and spin_unlock() everywhere and replace the
> irqsave/irqrestore versions ?
>

I take this back. Using either spin_lock and spin_lock_irq trigger
lockdep warnings due to the usage of these function from ISP irq
context.

To be honest I spent a considerable amount of time making all of this
lockdep proof and I'll keep using the irqsave version for the time
being considering the limited overhead compared to _irq.

> > > +
> > > +	if (vspx_pipe->enabled) {