[RFC PATCH 0/3] Implement Region of Interest(ROI) support.

Deepa Guthyappa Madivalara posted 3 patches 3 weeks, 5 days ago
[RFC PATCH 0/3] Implement Region of Interest(ROI) support.
Posted by Deepa Guthyappa Madivalara 3 weeks, 5 days ago
Hi all,

This patch set implements region of interest (ROI) support for the
video encoder. Each ROI is configured as a rectangular region with a
corresponding delta QP parameter. A new compound control,
V4L2_CID_MPEG_VIDEO_ENC_ROI, which maps to struct
v4l2_ctrl_enc_roi_params, is implemented to achieve this.

I'm sharing this series as an RFC because support in the firmware and
in the test framework, as well as GStreamer testing, is still in
progress. I would appreciate early feedback on the design,
implementation, and fixes before moving to a formal submission.

v4l2-ctl -d /dev/video1 --list-ctrls
..
hevc_b_frame_maximum_qp_value 0x00990b8c (int): min=1 max=51 step=1 default=51 value=51 flags=has-min-max
video_encoder_roi_params 0x00990b92 (unknown): type=284 value=unsupported payload type flags=has-payload

Thanks,
Deepa

Signed-off-by: Deepa Guthyappa Madivalara <deepa.madivalara@oss.qualcomm.com>
---
Deepa Guthyappa Madivalara (3):
      media: uapi: Introduce new control for video encoder ROI
      media: v4l2-core: Add support for video encoder ROI control
      media: iris: Add ROI support framework for video encoder

 .../userspace-api/media/v4l/ext-ctrls-codec.rst    |  7 +++
 drivers/media/platform/qcom/iris/iris_ctrls.c      | 54 +++++++++++++++++++++-
 drivers/media/platform/qcom/iris/iris_ctrls.h      |  1 +
 .../platform/qcom/iris/iris_platform_common.h      |  4 ++
 .../media/platform/qcom/iris/iris_platform_gen2.c  |  8 ++++
 drivers/media/v4l2-core/v4l2-ctrls-core.c          | 14 +++++-
 drivers/media/v4l2-core/v4l2-ctrls-defs.c          |  5 ++
 include/media/v4l2-ctrls.h                         |  1 +
 include/uapi/linux/v4l2-controls.h                 |  1 +
 include/uapi/linux/videodev2.h                     | 17 +++++++
 10 files changed, 110 insertions(+), 2 deletions(-)
---
base-commit: f417b7ffcbef7d76b0d8860518f50dae0e7e5eda
change-id: 20260112-iris_enc_roi-8898f9a2455f

Best regards,
-- 
Deepa Guthyappa Madivalara <deepa.madivalara@oss.qualcomm.com>
Re: [RFC PATCH 0/3] Implement Region of Interest(ROI) support.
Posted by Nicolas Dufresne 3 weeks, 4 days ago
Hi,

On Tuesday, 13 January 2026 at 12:33 -0800, Deepa Guthyappa Madivalara wrote:
> Hi all,
> 
> This patch set implements region of interest (ROI) support for the
> video encoder. Each ROI is configured as a rectangular region with a
> corresponding delta QP parameter. A new compound control,
> V4L2_CID_MPEG_VIDEO_ENC_ROI, which maps to struct
> v4l2_ctrl_enc_roi_params, is implemented to achieve this.

My very first question is: why ROI rather than a QP map? It seems that
modern APIs such as D3D12 and Vulkan Video aim for QP maps instead of a limited
set of rectangles, while older hardware/firmware have ROI. Since you say that
this is not yet implemented in your firmware, I thought it was worth asking.

ROIs are relatively easy to convert into QP maps, but the opposite conversion
is going to be a lot less accurate. That being said, the number of ROIs can be
extremely limited; at least this is the case for the Samsung MFC firmware and
Hantro encoders (no upstream driver yet).

Let us know your thoughts: should we adopt just one and have drivers translate
once HW moves to the new approach? Should we eventually support both?

Nicolas

Re: [RFC PATCH 0/3] Implement Region of Interest(ROI) support.
Posted by Deepa Guthyappa Madivalara 3 weeks, 4 days ago
On 1/14/2026 8:08 AM, Nicolas Dufresne wrote:
> Hi,
>
> On Tuesday, 13 January 2026 at 12:33 -0800, Deepa Guthyappa Madivalara wrote:
>> Hi all,
>>
>> This patch set implements region of interest (ROI) support for the
>> video encoder. Each ROI is configured as a rectangular region with a
>> corresponding delta QP parameter. A new compound control,
>> V4L2_CID_MPEG_VIDEO_ENC_ROI, which maps to struct
>> v4l2_ctrl_enc_roi_params, is implemented to achieve this.
> My very first question is: why ROI rather than a QP map? It seems that
> modern APIs such as D3D12 and Vulkan Video aim for QP maps instead of a limited
> set of rectangles, while older hardware/firmware have ROI. Since you say that
> this is not yet implemented in your firmware, I thought it was worth asking.
>
> ROIs are relatively easy to convert into QP maps, but the opposite conversion
> is going to be a lot less accurate. That being said, the number of ROIs can be
> extremely limited; at least this is the case for the Samsung MFC firmware and
> Hantro encoders (no upstream driver yet).
>
> Let us know your thoughts: should we adopt just one and have drivers translate
> once HW moves to the new approach? Should we eventually support both?
>
> Nicolas

Hi Nicolas,

Thanks for the quick comments.
A QP map can be too much data to send from user space to the firmware
via a control on every frame.
For example, AVC uses a macroblock size of 16x16, and the max mbpf the
iris driver supports corresponds to 8192x4352. That is 512 x 272 =
139264 macroblocks, so with an 8-bit QP per macroblock, 136 KB of data
would need to be transferred for each frame in the worst case.
While we are still evaluating the QP map option, firmware performance
issues are pushing us towards rectangular ROI.
I am not sure whether we will need to support the QP map in the future.
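
The 136 KB worst-case figure can be checked with a few lines of Python.
This is only a sketch: the function name and its parameters are
illustrative and are not part of the series or of any V4L2 API.

```python
def qp_map_bytes(width, height, block_size=16, bytes_per_qp=1):
    """Size in bytes of a QP map holding one entry per block.

    Partial blocks at the right and bottom edges count as full blocks.
    """
    blocks_x = -(-width // block_size)   # integer ceiling division
    blocks_y = -(-height // block_size)
    return blocks_x * blocks_y * bytes_per_qp


# Worst case from the mail: 8192x4352 frame, 16x16 macroblocks, 8-bit QP.
worst_case = qp_map_bytes(8192, 4352)
print(worst_case, "bytes =", worst_case // 1024, "KB")
```

The block_size parameter is there only to show how the payload scales:
with a hypothetical 64x64 granularity the same frame needs 8704 bytes.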

Re: [RFC PATCH 0/3] Implement Region of Interest(ROI) support.
Posted by Nicolas Dufresne 3 weeks, 3 days ago
Hi,

On Wednesday, 14 January 2026 at 14:14 -0800, Deepa Guthyappa Madivalara wrote:


[...]

> 
> Thanks for the quick comments.
> A QP map can be too much data to send from user space to the firmware
> via a control on every frame.
> For example, AVC uses a macroblock size of 16x16, and the max mbpf the
> iris driver supports corresponds to 8192x4352. That is 512 x 272 =
> 139264 macroblocks, so with an 8-bit QP per macroblock, 136 KB of data
> would need to be transferred for each frame in the worst case.
> While we are still evaluating the QP map option, firmware performance
> issues are pushing us towards rectangular ROI.
> I am not sure whether we will need to support the QP map in the future.

Have you read how this is implemented in Vulkan and D3D12? Please have a look:

- Vulkan Video, see quantizationMapTexelSize [0]
- D3D, see QPMapRegionPixelsSize [1]

[0] https://docs.vulkan.org/features/latest/features/proposals/VK_KHR_video_encode_quantization_map.html
[1] https://microsoft.github.io/DirectX-Specs/d3d/D3D12_Video_Encoding_Texture_QPMap_DirtyMap_MotionVectors.html

Note that D3D also supports dirty regions (what you call ROI in this proposal),
with no limit on their number, since drivers translate them into a map (it's a
software feature on top). It also supports motion search hints, though that one
seems rare.

I'm not against having ROI in our API; it's common in older chip designs, but
it's clearly going away in the long run, since most fixed hardware imposes a
very low region count, which is not usable for modern applications. ROI is
trivial to implement on top of QP maps.
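
To illustrate why ROI is cheap to layer on top of a QP map, here is a
minimal Python sketch of the conversion. Everything in it is an assumption
made for illustration: the (x, y, w, h, delta_qp) tuple layout, the
function name, and the rule that later rectangles override earlier ones.
It is not the layout of struct v4l2_ctrl_enc_roi_params.

```python
def rois_to_qp_map(width, height, rois, block_size=16):
    """Rasterize rectangles into a per-block delta-QP grid.

    rois is a list of (x, y, w, h, delta_qp) tuples in pixel units.
    Blocks not covered by any rectangle keep a delta QP of 0.
    """
    cols = -(-width // block_size)   # integer ceiling division
    rows = -(-height // block_size)
    qp_map = [[0] * cols for _ in range(rows)]
    for x, y, w, h, delta_qp in rois:
        # Clamp the rectangle to the frame and skip empty ones.
        x0, y0 = max(x, 0), max(y, 0)
        x1, y1 = min(x + w, width), min(y + h, height)
        if x0 >= x1 or y0 >= y1:
            continue
        # Mark every block the rectangle touches, even partially.
        for by in range(y0 // block_size, (y1 - 1) // block_size + 1):
            for bx in range(x0 // block_size, (x1 - 1) // block_size + 1):
                qp_map[by][bx] = delta_qp  # assumption: last ROI wins
    return qp_map


# A 64x48 frame with one 32x16 region at (16, 16) and delta QP -5.
for row in rois_to_qp_map(64, 48, [(16, 16, 32, 16, -5)]):
    print(row)
```

Going the other way (map to a handful of rectangles) has no such simple
loop, which is the asymmetry described above.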

A typical use case is to use lightweight AI or traditional CV to locate the
most relevant portions of a video. The result is more like a heat map than a
set of rectangles. We then roughly map it onto a low-granularity QP map before
encoding. This allows maintaining very low bandwidth while preserving the
information needed for the heavier processing in the cloud. I'm including one of
the many examples of that, a talk from Spiideo [2].

[2] https://gstconf.ubicast.tv/videos/region-based-compression-in-gstreamer/
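
The "heat map to low-granularity QP map" step could look roughly like the
Python sketch below. It assumes heat values normalized to [0, 1] and a
linear scaling to a delta QP; the function name, the cell size, and the
max_delta_qp default are all made up for illustration and do not come from
Vulkan, D3D12, or V4L2.

```python
def heatmap_to_qp_map(heat, cell, max_delta_qp=-8):
    """Reduce a 2D heat map to a coarse delta-QP grid.

    heat is a list of rows with values in [0, 1]. Each cell x cell tile is
    averaged, and the mean is scaled linearly so that 0 maps to a delta QP
    of 0 and 1 maps to max_delta_qp (negative means higher quality).
    """
    rows, cols = len(heat), len(heat[0])
    qp_map = []
    for top in range(0, rows, cell):
        out_row = []
        for left in range(0, cols, cell):
            # Collect the tile, truncated at the bottom and right edges.
            tile = [heat[r][c]
                    for r in range(top, min(top + cell, rows))
                    for c in range(left, min(left + cell, cols))]
            mean = sum(tile) / len(tile)
            out_row.append(round(mean * max_delta_qp))
        qp_map.append(out_row)
    return qp_map


# 4x4 heat map with a hot top-left corner, reduced to a 2x2 grid.
heat = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(heatmap_to_qp_map(heat, cell=2))
```

A coarser cell gives a smaller map to hand to the encoder, at the cost of
spatial precision.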


regards,
Nicolas
Re: [RFC PATCH 0/3] Implement Region of Interest(ROI) support.
Posted by Deepa Guthyappa Madivalara 3 weeks, 2 days ago
On 1/15/2026 5:42 AM, Nicolas Dufresne wrote:
> Hi,
>
> On Wednesday, 14 January 2026 at 14:14 -0800, Deepa Guthyappa Madivalara wrote:
>
>
> [...]
>
>> Thanks for the quick comments.
>> A QP map can be too much data to send from user space to the firmware
>> via a control on every frame.
>> For example, AVC uses a macroblock size of 16x16, and the max mbpf the
>> iris driver supports corresponds to 8192x4352. That is 512 x 272 =
>> 139264 macroblocks, so with an 8-bit QP per macroblock, 136 KB of data
>> would need to be transferred for each frame in the worst case.
>> While we are still evaluating the QP map option, firmware performance
>> issues are pushing us towards rectangular ROI.
>> I am not sure whether we will need to support the QP map in the future.
> Have you read how this is implemented in Vulkan and D3D12? Please have a look:
>
> - Vulkan Video, see quantizationMapTexelSize [0]
> - D3D, see QPMapRegionPixelsSize [1]
>
> [0] https://docs.vulkan.org/features/latest/features/proposals/VK_KHR_video_encode_quantization_map.html
> [1] https://microsoft.github.io/DirectX-Specs/d3d/D3D12_Video_Encoding_Texture_QPMap_DirtyMap_MotionVectors.html
>
> Note that D3D also supports dirty regions (what you call ROI in this proposal),
> with no limit on their number, since drivers translate them into a map (it's a
> software feature on top). It also supports motion search hints, though that one
> seems rare.
>
> I'm not against having ROI in our API; it's common in older chip designs, but
> it's clearly going away in the long run, since most fixed hardware imposes a
> very low region count, which is not usable for modern applications. ROI is
> trivial to implement on top of QP maps.
>
> A typical use case is to use lightweight AI or traditional CV to locate the
> most relevant portions of a video. The result is more like a heat map than a
> set of rectangles. We then roughly map it onto a low-granularity QP map before
> encoding. This allows maintaining very low bandwidth while preserving the
> information needed for the heavier processing in the cloud. I'm including one of
> the many examples of that, a talk from Spiideo [2].
>
> [2] https://gstconf.ubicast.tv/videos/region-based-compression-in-gstreamer/
>
>
> regards,
> Nicolas
Thank you for the detailed information. I reviewed the documentation
and agree with your assessment. I am following up with my team to 
explore this further.

Thanks,
Deepa

Re: [RFC PATCH 0/3] Implement Region of Interest(ROI) support.
Posted by Deepa Guthyappa Madivalara 2 weeks, 2 days ago
Hi Nicolas,

Thanks for the references. I was able to go through them and speak to
the teams internally. Here is our understanding and proposal for ROI.

To support rectangle-based QP, userspace will set the compound control,
and the driver will convert the rectangle QP data to a QP format the
firmware understands (similar to the MB-based QP format) and send it to
the firmware.

To support MB-based QP, we need more input on how to send 136 KB of QP
data (e.g. for an 8K UHD frame, with one byte of QP per 16x16 MB) from
GStreamer/userspace to the video driver. Can we send it using a compound
control, or are there alternative approaches available?

Thanks
Deepa