[PATCH v7 0/4] Performance improvement of decoder

Jackson.lee posted 4 patches 1 week, 5 days ago
Posted by Jackson.lee 1 week, 5 days ago
From: Jackson Lee <jackson.lee@chipsnmedia.com>

v4l2-compliance results:
========================

v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t

Buffer ioctls:
                warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
                warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
        test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
        test CREATE_BUFS maximum buffers: OK
        test VIDIOC_EXPBUF: OK
        test Requests: OK (Not Supported)

Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0, Warnings: 2
Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0, Warnings: 0

Fluster test results:
=====================

Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-Gst1.0 Using 3 parallel job(s)
Ran 133/147 tests successfully              in 61.467 secs

(1 test fails because parsing multiple frames is not supported, 1 test fails because of a missing frame and slight corruption,
 2 tests fail because of sizes that are incompatible with the IP, 11 tests fail because of the unsupported 10-bit format)


Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-Gst1.0 Using 3 parallel job(s)
Ran 78/135 tests successfully               in 45.083 secs

(57 tests fail because the hardware is unable to decode MBAFF / FMO / Field / Extended profile streams.)

Running test suite JVT-FR-EXT with decoder GStreamer-H.264-V4L2-Gst1.0 Using 3 parallel job(s)
Ran 25/69 tests successfully               in 15.176 secs

(44 tests fail because the hardware does not support field-encoded and 4:2:2-encoded streams)
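
As a quick cross-check of the Fluster results above, the number of failing tests per suite is the total minus the pass count. The snippet below is only an illustrative sketch; the `failed` helper is a hypothetical name, not part of Fluster.

```shell
#!/bin/sh
# failed TOTAL PASSED: print the number of failing tests in a suite.
# (Hypothetical helper for illustration; not a Fluster command.)
failed() {
    echo $(( $1 - $2 ))
}

# Totals and pass counts copied from the Fluster results above.
echo "JCT-VC-HEVC_V1: $(failed 147 133) failed"
echo "JVT-AVC_V1:     $(failed 135 78) failed"
echo "JVT-FR-EXT:     $(failed 69 25) failed"
```

This gives 14, 57 and 44 failures. The per-reason counts listed for JCT-VC-HEVC_V1 add up to 15, one more than the 14 failures, so one test is presumably counted under two reasons (an assumption; the results do not say so).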

Seek test
=====================
1. gst-play-1.0 seek.264
2. This will use waylandsink since gst-play-1.0 uses playbin.
   If you don't want to hook up a display,
   you can run gst-play-1.0 seek.264 --videosink=fakevideosink instead
3. Let the pipeline run for 2-3 seconds
4. Press the SPACE key to pause
5. Press 0 to reset, then press SPACE to start again

gst-play-1.0 seek.264 --videosink=fakevideosink
Press 'k' to see a list of keyboard shortcuts.
Now playing /root/seek.264
Redistribute latency...
Redistribute latency...
Redistribute latency...
Redistribute latency...
Redistribute latency...aused
0:00:09.9 / 0:00:09.9
Reached end of play list.


Sequence Change test
=====================
gst-launch-1.0 filesrc location=./drc.h264 ! h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Redistribute latency...
Got EOS from element "pipeline0".
Execution ended after 0:00:00.100759170
Setting pipeline to NULL ...
Freeing pipeline ...

Change since v6:
================
* For [PATCH v4 4/4] media: chips-media: wave5: Improve performance of decoder
 - change code of inst_src_buf_remove
 - add set_instance_state to remove code redundancy

Change since v5:
================
* For [PATCH v4 4/4] media: chips-media: wave5: Improve performance of decoder
 - reduce high CPU usage during playback of adaptive streaming from a
   streaming server

* For [PATCH v4 1/4] media: chips-media: wave5: Fix SError of kernel panic when closed
 - fix kernel panic when printing a lot of log messages

Change since v4:
=================
* For [PATCH v5 4/4] media: chips-media: wave5: Improve performance of decoder
 - fix the error which the Media CI robot reported

* For [PATCH v5 2/4] media: chips-media: wave5: Fix Null reference while testing fluster
 - fix the error which the Media CI robot reported

Change since v3:
==================
* For [PATCH v4 4/4] media: chips-media: wave5: Improve performance of decoder
 - fix crash and deadlock while testing seek

* For [PATCH v4 3/4] media: chips-media: wave5: Add WARN_ON to check if dec_output_info is NULL
 - update commit message

* For [PATCH v4 2/4] media: chips-media: wave5: Fix Null reference while testing fluster
 - add thread irq logic

* For [PATCH v4 1/4] media: chips-media: wave5: Fix SError of kernel panic when closed
 - add Reviewed-by tag

Change since v2:
==================
* For [PATCH v3 4/4] media: chips-media: wave5: Improve performance of decoder
 - squash v2's #3~#6 to #4 patch of v3

Change since v1:
===================
* For [PATCH v2 2/7] media: chips-media: wave5: Improve performance of decoder
 - change log to dbg level

Change since v0:
===================
* For [PATCH v1 2/7] media: chips-media: wave5: Improve performance of decoder
 - split the previous patch into several patches

* For [PATCH v1 3/7] media: chips-media: wave5: Fix not to be closed
 - separated from the previous patch of performance improvement of
   decoder

* For [PATCH v1 4/7] media: chips-media: wave5: Use spinlock whenever state is changed
 - separated from the previous patch of performance improvement of
   decoder

* For [PATCH v1 5/7] media: chips-media: wave5: Fix not to free resources normally when
    instance was destroyed
 - separated from the previous patch of performance improvement of
   decoder

* For [PATCH v1 7/7] media: chips-media: wave5: Fix SError of kernel panic when closed
 - separated from the previous patch of performance improvement of
   decoder

Jackson Lee (4):
  media: chips-media: wave5: Fix SError of kernel panic when closed
  media: chips-media: wave5: Fix Null reference while testing fluster
  media: chips-media: wave5: Add WARN_ON to check if dec_output_info is
    NULL
  media: chips-media: wave5: Improve performance of decoder

 .../platform/chips-media/wave5/wave5-helper.c |  28 ++-
 .../platform/chips-media/wave5/wave5-helper.h |   1 +
 .../platform/chips-media/wave5/wave5-hw.c     |   2 +-
 .../chips-media/wave5/wave5-vpu-dec.c         | 189 +++++++++++++-----
 .../chips-media/wave5/wave5-vpu-enc.c         |   8 +-
 .../platform/chips-media/wave5/wave5-vpu.c    |  98 ++++++++-
 .../platform/chips-media/wave5/wave5-vpu.h    |   2 +-
 .../platform/chips-media/wave5/wave5-vpuapi.c |  68 ++++---
 .../platform/chips-media/wave5/wave5-vpuapi.h |  12 ++
 .../chips-media/wave5/wave5-vpuconfig.h       |   1 +
 10 files changed, 310 insertions(+), 99 deletions(-)

-- 
2.43.0
Re: [PATCH v7 0/4] Performance improvement of decoder
Posted by Brandon Brnich 6 hours ago
Hi Jackson,

On 11/19/2025 12:25 AM, Jackson.lee wrote:
> From: Jackson Lee <jackson.lee@chipsnmedia.com>
> 
> v4l2-compliance results:
> ========================
> 
> v4l2-compliance 1.28.1-5233, 64 bits, 64-bit time_t
> 
> Buffer ioctls:
>                  warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
>                  warn: v4l2-test-buffers.cpp(693): VIDIOC_CREATE_BUFS not supported
>          test VIDIOC_REQBUFS/CREATE_BUFS/QUERYBUF: OK
>          test CREATE_BUFS maximum buffers: OK
>          test VIDIOC_EXPBUF: OK
>          test Requests: OK (Not Supported)
> 
> Total for wave5-dec device /dev/video0: 46, Succeeded: 46, Failed: 0, Warnings: 2
> Total for wave5-enc device /dev/video1: 46, Succeeded: 46, Failed: 0, Warnings: 0
> 
> Fluster test results:
> =====================
> 
> Running test suite JCT-VC-HEVC_V1 with decoder GStreamer-H.265-V4L2-Gst1.0 Using 3 parallel job(s)
> Ran 133/147 tests successfully              in 61.467 secs
> 
> (1 test fails because of not supporting to parse multi frames, 1 test fails because of a missing frame and slight corruption,
>   2 tests fail because of sizes which are incompatible with the IP, 11 tests fail because of unsupported 10 bit format)
> 
> 
> Running test suite JVT-AVC_V1 with decoder GStreamer-H.264-V4L2-Gst1.0 Using 3 parallel job(s)
> Ran 78/135 tests successfully               in 45.083 secs
> 
> (57 fail because the hardware is unable to decode  MBAFF / FMO / Field / Extended profile streams.)
> 
> Running test suite JVT-FR-EXT with decoder GStreamer-H.264-V4L2-Gst1.0 Using 3 parallel job(s)
> Ran 25/69 tests successfully               in 15.176 secs
> 
> (44 fail because the hardware does not support field encoded and 422 encoded stream)
> 
> Seek test
> =====================
> 1. gst-play-1.0 seek.264
> 2. this will use waylandsink since gst-play-1.0 uses playbin.
>     if you don't want to hook up display,
>     you can run gst-play-1.0 seek.264 --videosink=fakevideosink instead
> 3. Let pipeline run for 2-3 seconds
> 4. press SPACE key to pause
> 5. press 0 to reset press SPACE to start again
> 
> gst-play-1.0 seek.264 --videosink=fakevideosink Press 'k' to see a list of keyboard shortcuts.
> Now playing /root/seek.264
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...
> Redistribute latency...aused
> 0:00:09.9 / 0:00:09.9
> Reached end of play list.
> 
> 
> Sequence Change test
> =====================
> gst-launch-1.0 filesrc location=./drc.h264 ! h264parse ! v4l2h264dec ! filesink location=./h264_output_420.yuv
> Setting pipeline to PAUSED ...
> Pipeline is PREROLLING ...
> Redistribute latency...
> Pipeline is PREROLLED ...
> Setting pipeline to PLAYING ...
> New clock: GstSystemClock
> Redistribute latency...
> Got EOS from element "pipeline0".
> Execution ended after 0:00:00.100759170
> Setting pipeline to NULL ...
> Freeing pipeline ...
> 

I have performed all the same testing you have listed above on this 
series as well as some additional scenarios to validate performance and 
reliability. These tests were conducted using interrupts and the hrtimer 
case. The only difference I have is that my v4l2-compliance reports 47 
out of 47 tests passed instead of your 46 out of 46 - I'm sure this is 
just a version mismatch between us.

In these tests I have validated the bug I reported in patch 1 is now 
fixed. There are no more SError panics when logging is high.

I have also validated that 4K60, 2x4K30, and 8x1080p30 are meeting 
timing requirements. Those tests apply to both H264 and H265. These 
tests were conducted on TI's J784s4 SoC where I increased the CMA size 
so that Wave5 could have access to enough memory for these tests to 
work. Each of the mentioned scenarios was also run for 30+ minutes just 
to ensure that longer running cases are valid as well.
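
For anyone wanting to reproduce a multi-instance case such as 8x1080p30, a run could be started roughly as sketched below. This is not the exact setup used for the results above: the input path, the `COUNT` and `DRY_RUN` variables, and the helper names are assumptions for illustration, and the pipeline simply reuses the `h264parse ! v4l2h264dec` chain from the sequence change test with `fakesink` as the sink. It only starts the decoders and does not measure timing.

```shell
#!/bin/sh
# Hypothetical defaults; adjust to the stream and instance count under test.
INPUT="${INPUT:-./input_1080p30.h264}"
COUNT="${COUNT:-8}"

# Print the gst-launch-1.0 command line for one decode instance.
pipeline_cmd() {
    echo "gst-launch-1.0 filesrc location=$1 ! h264parse ! v4l2h264dec ! fakesink"
}

# Start COUNT instances in parallel and wait for all of them.
# With DRY_RUN=1 (the default here) the commands are only printed.
run_all() {
    i=0
    while [ "$i" -lt "$COUNT" ]; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            pipeline_cmd "$INPUT"
        else
            # Unquoted on purpose so the command splits into words;
            # this assumes INPUT contains no spaces.
            $(pipeline_cmd "$INPUT") &
        fi
        i=$((i + 1))
    done
    wait
}

run_all
```

Running it with `DRY_RUN=0 COUNT=8 INPUT=<stream>` would launch the pipelines for real; the default dry run just prints the eight command lines so they can be checked first.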

For the series,

Tested-by: Brandon Brnich <b-brnich@ti.com>

Best,
Brandon