From nobody Thu Apr 16 12:32:24 2026
From: Dave Stevenson
Date: Fri, 27 Feb 2026 17:19:06 +0000
Subject: [PATCH v5 1/6] docs: uapi: media: Clarify HEVC slice_param bit_size, data_byte_offset
Message-Id: <20260227-media-rpi-hevc-dec-v5-1-9bb3fc1816de@raspberrypi.com>
References: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
In-Reply-To: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
To: Sakari Ailus, Laurent Pinchart, Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli, Broadcom internal kernel review list, John Cox, Dom Cobley, review list, Ezequiel Garcia
Cc: Nicolas Dufresne, John Cox, Stefan Wahren, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Dave Stevenson

From: John Cox

Clarify exactly what bit_size and data_byte_offset mean when there are
multiple slices in the bitstream data.

Signed-off-by: John Cox
Signed-off-by: Dave Stevenson
---
 Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
index 3b1e05c6eb13..a54e8ea29440 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec-stateless.rst
@@ -2399,10 +2399,12 @@ This structure contains all loop filter related parameters. See sections
 
     * - __u32
       - ``bit_size``
-      - Size (in bits) of the current slice data.
+      - Size in bits of the slice_segment_data for the current slice including
+        any emulation prevention bytes.
     * - __u32
       - ``data_byte_offset``
-      - Offset (in byte) to the video data in the current slice data.
+      - Offset in bytes from the start of the current v4l2_buffer to the start
+        of the slice_segment_data for the current slice.
     * - __u32
       - ``num_entry_point_offsets``
       - Specifies the number of entry point offset syntax elements in the slice header.
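
For illustration only, not part of the patch: a minimal userspace-side
sketch of how the two clarified fields could be derived when one buffer
carries several slices. The struct slice_location and locate_slice()
names are hypothetical. The sketch assumes the caller already knows
where each slice_segment_data starts and ends inside the buffer, and
that its length is a whole number of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical holder for the two fields discussed above. This is not the
 * kernel's struct v4l2_ctrl_hevc_slice_params, only a stand-in for it.
 */
struct slice_location {
	uint32_t bit_size;         /* bits of slice_segment_data, emulation prevention bytes included */
	uint32_t data_byte_offset; /* bytes from the start of the buffer to slice_segment_data */
};

/*
 * buf_start:  first byte of the buffer that holds every slice of the frame
 * data_start: first byte of this slice's slice_segment_data
 * data_end:   one past its last byte, emulation prevention bytes not removed
 */
static struct slice_location locate_slice(const uint8_t *buf_start,
					  const uint8_t *data_start,
					  const uint8_t *data_end)
{
	struct slice_location loc;

	assert(buf_start <= data_start && data_start <= data_end);

	/* The offset is measured from the buffer start, not from the slice's own NAL unit. */
	loc.data_byte_offset = (uint32_t)(data_start - buf_start);
	/* Assumption of this sketch: the slice data length is a whole number of bytes. */
	loc.bit_size = (uint32_t)(data_end - data_start) * 8u;
	return loc;
}
```

With two slices whose data starts at byte 6 and byte 40 of the same
buffer, each 24 bytes long, both get a bit_size of 192 while
data_byte_offset is 6 and 40 respectively.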
-- 
2.34.1

From nobody Thu Apr 16 12:32:24 2026
From: Dave Stevenson
Date: Fri, 27 Feb 2026 17:19:07 +0000
Subject: [PATCH v5 2/6] docs: uapi: media: Document Raspberry Pi NV12 column format
Message-Id: <20260227-media-rpi-hevc-dec-v5-2-9bb3fc1816de@raspberrypi.com>
References: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
In-Reply-To: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
To: Sakari Ailus, Laurent Pinchart, Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli, Broadcom internal kernel review list, John Cox, Dom Cobley, review list, Ezequiel Garcia
Cc: Nicolas Dufresne, John Cox, Stefan Wahren, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Dave Stevenson

The Raspberry Pi HEVC decoder uses a tiled format based on columns for
8 and 10 bit YUV images, so document them as NV12MT_COL128 and
NV12MT_10_COL128.

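For illustration only, not part of the patch: a minimal sketch of the
10-bit packing that the documentation added by this patch describes
(three samples per 32-bit word, top two bits unused). The helper names
are hypothetical. The sketch assumes the word is stored little-endian,
which is what the byte-level table in the documentation implies.

```c
#include <stdint.h>

/*
 * Pack three 10-bit samples into one 32-bit word:
 * s0 in bits 9..0, s1 in bits 19..10, s2 in bits 29..20.
 * Bits 31..30 are unused and left as zero.
 */
static uint32_t col128_pack3(uint16_t s0, uint16_t s1, uint16_t s2)
{
	return (uint32_t)(s0 & 0x3ff) |
	       ((uint32_t)(s1 & 0x3ff) << 10) |
	       ((uint32_t)(s2 & 0x3ff) << 20);
}

/* Extract sample idx (0, 1 or 2) from a packed word. */
static uint16_t col128_unpack(uint32_t word, unsigned int idx)
{
	return (uint16_t)((word >> (10 * idx)) & 0x3ff);
}

/* One row of a 128-byte-wide column holds 32 words of 3 samples each. */
enum { COL128_10_SAMPLES_PER_ROW = (128 / 4) * 3 };
```

For the CbCr plane the same words carry Cb00, Cr00, Cb01 and then Cr01,
Cb02, Cr02, so three chroma pairs span two consecutive words.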
Signed-off-by: Dave Stevenson
---
 .../userspace-api/media/v4l/pixfmt-yuv-planar.rst  | 42 ++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst b/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst
index 0631919bd667..1e7146230f09 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt-yuv-planar.rst
@@ -956,6 +956,48 @@ Data in the 12 high bits, zeros in the 4 low bits, arranged in little endian ord
       - Cb\ :sub:`11`
       - Cr\ :sub:`11`
 
+V4L2_PIX_FMT_NV12MT_COL128 and V4L2_PIX_FMT_NV12MT_10_COL128
+------------------------------------------------------------
+
+``V4L2_PIX_FMT_NV12MT_COL128`` is a tiled version of
+``V4L2_PIX_FMT_NV12M`` where the two planes are split into 128 byte wide columns
+of Y or interleaved CbCr. The height is always aligned to a multiple of 8 lines.
+
+``V4L2_PIX_FMT_NV12MT_10_COL128`` extends that to a 10-bit format where three
+10-bit values are packed into a 32-bit word. A 128 byte wide column therefore
+holds 96 samples (either Y or interleaved CbCr). That effectively makes it 6
+values in a 64-bit word for the CbCr plane, as the values always go in pairs.
+
+Bit-packed representation.
+
+.. tabularcolumns:: |p{1.2cm}||p{1.2cm}||p{1.2cm}||p{1.2cm}|p{3.2cm}|p{3.2cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths: 8 8 8 8
+
+    * - Y'\ :sub:`00[7:0]`
+      - Y'\ :sub:`01[5:0]`\ (bits 7--2) Y'\ :sub:`00[9:8]`\ (bits 1--0)
+      - Y'\ :sub:`02[3:0]`\ (bits 7--4) Y'\ :sub:`01[9:6]`\ (bits 3--0)
+      - unused (bits 7--6) Y'\ :sub:`02[9:4]`\ (bits 5--0)
+
+.. tabularcolumns:: |p{1.2cm}||p{1.2cm}||p{1.2cm}||p{1.2cm}|p{3.2cm}|p{3.2cm}|
+
+.. flat-table::
+    :header-rows:  0
+    :stub-columns: 0
+    :widths: 12 12 12 12 12 12 12 12
+
+    * - Cb\ :sub:`00[7:0]`
+      - Cr\ :sub:`00[5:0]`\ (bits 7--2) Cb\ :sub:`00[9:8]`\ (bits 1--0)
+      - Cb\ :sub:`01[3:0]`\ (bits 7--4) Cr\ :sub:`00[9:6]`\ (bits 3--0)
+      - unused (bits 7--6) Cb\ :sub:`01[9:4]`\ (bits 5--0)
+      - Cr\ :sub:`01[7:0]`
+      - Cb\ :sub:`02[5:0]`\ (bits 7--2) Cr\ :sub:`01[9:8]`\ (bits 1--0)
+      - Cr\ :sub:`02[3:0]`\ (bits 7--4) Cb\ :sub:`02[9:6]`\ (bits 3--0)
+      - unused (bits 7--6) Cr\ :sub:`02[9:4]`\ (bits 5--0)
+
 
 Fully Planar YUV Formats
 ========================
-- 
2.34.1

From nobody Thu Apr 16 12:32:24 2026
From: Dave Stevenson
Date: Fri, 27 Feb 2026 17:19:08 +0000
Subject: [PATCH v5 3/6] media: ioctl: Add pixel formats NV12MT_COL128 and NV12MT_10_COL128
Message-Id: <20260227-media-rpi-hevc-dec-v5-3-9bb3fc1816de@raspberrypi.com>
References: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
In-Reply-To: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
To: Sakari Ailus, Laurent Pinchart, Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli, Broadcom internal kernel review list, John Cox, Dom Cobley, review list, Ezequiel Garcia
Cc: Nicolas Dufresne, John Cox, Stefan Wahren, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Dave Stevenson

Add V4L2_PIX_FMT_NV12MT_COL128 and V4L2_PIX_FMT_NV12MT_10_COL128 to
describe the Raspberry Pi HEVC decoder NV12 multiplanar formats.

NV12MT_COL128 is added to v4l2_format_info. NV12MT_10_COL128 is not,
because its block width (96 pixels) is not a power of 2 and the
framework uses ALIGN with that value.

Signed-off-by: Dave Stevenson
---
 drivers/media/v4l2-core/v4l2-common.c | 2 ++
 drivers/media/v4l2-core/v4l2-ioctl.c  | 2 ++
 include/uapi/linux/videodev2.h        | 4 ++++
 3 files changed, 8 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
index 554c591e1113..20a7066df570 100644
--- a/drivers/media/v4l2-core/v4l2-common.c
+++ b/drivers/media/v4l2-core/v4l2-common.c
@@ -311,6 +311,8 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
 		{ .format = V4L2_PIX_FMT_NV15_4L4, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 5, 10, 0, 0 }, .bpp_div = { 4, 4, 1, 1 }, .hdiv = 2, .vdiv = 2,
 		  .block_w = { 4, 2, 0, 0 }, .block_h = { 1, 1, 0, 0 }},
 		{ .format = V4L2_PIX_FMT_P010_4L4, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 2, 4, 0, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 2 },
+		{ .format = V4L2_PIX_FMT_NV12MT_COL128, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 2 },
+		/* V4L2_PIX_FMT_NV12MT_10_COL128 cannot be described within the current constraints of v4l2_format_info as 96 pixels is not a power of 2 */
 
 		/* YUV planar formats, non contiguous variant */
 		{ .format = V4L2_PIX_FMT_YUV420M, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 3, .comp_planes = 3, .bpp = { 1, 1, 1, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 2 },
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index 37d33d4a363d..2fe8f591cdb3 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1379,7 +1379,9 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
 	case V4L2_PIX_FMT_NV16M:	descr = "Y/UV 4:2:2 (N-C)"; break;
 	case V4L2_PIX_FMT_NV61M:	descr = "Y/VU 4:2:2 (N-C)"; break;
 	case V4L2_PIX_FMT_NV12MT:	descr = "Y/UV 4:2:0 (64x32 MB, N-C)"; break;
+	case V4L2_PIX_FMT_NV12MT_COL128: descr = "Y/CbCr 4:2:0 (128b cols)"; break;
 	case V4L2_PIX_FMT_NV12MT_16X16:	descr = "Y/UV 4:2:0 (16x16 MB, N-C)"; break;
+	case V4L2_PIX_FMT_NV12MT_10_COL128: descr = "10-bit Y/CbCr 4:2:0 (128b cols)"; break;
 	case V4L2_PIX_FMT_P012M:	descr = "12-bit Y/UV 4:2:0 (N-C)"; break;
 	case V4L2_PIX_FMT_YUV420M:	descr = "Planar YUV 4:2:0 (N-C)"; break;
 	case V4L2_PIX_FMT_YVU420M:	descr = "Planar YVU 4:2:0 (N-C)"; break;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index eda4492e40dc..e466c816ff2f 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -697,6 +697,10 @@ struct v4l2_pix_format {
 #define V4L2_PIX_FMT_NV12MT_16X16 v4l2_fourcc('V', 'M', '1', '2') /* 12  Y/CbCr 4:2:0 16x16 tiles */
 #define V4L2_PIX_FMT_NV12M_8L128 v4l2_fourcc('N', 'A', '1', '2') /* Y/CbCr 4:2:0 8x128 tiles */
 #define V4L2_PIX_FMT_NV12M_10BE_8L128 v4l2_fourcc_be('N', 'T', '1', '2') /* Y/CbCr 4:2:0 10-bit 8x128 tiles */
+#define V4L2_PIX_FMT_NV12MT_COL128 v4l2_fourcc('N', 'c', '1', '2') /* 12  Y/CbCr 4:2:0 128 pixel wide column */
+#define V4L2_PIX_FMT_NV12MT_10_COL128 v4l2_fourcc('N', 'c', '3', '0')
+	/* Y/CbCr 4:2:0 10bpc, 3x10 packed as 4 bytes in a 128 bytes / 96 pixel wide column */
+
 
 /* Bayer formats - see http://www.siliconimaging.com/RGB%20Bayer.htm */
 #define V4L2_PIX_FMT_SBGGR8 v4l2_fourcc('B', 'A', '8', '1') /* 8 BGBG.. GRGR.. */
-- 
2.34.1

From nobody Thu Apr 16 12:32:24 2026
From: Dave Stevenson
Date: Fri, 27 Feb 2026 17:19:09 +0000
Subject: [PATCH v5 4/6] dt-bindings: media: Add the Raspberry Pi HEVC decoder
Message-Id: <20260227-media-rpi-hevc-dec-v5-4-9bb3fc1816de@raspberrypi.com>
References: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
In-Reply-To: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
To: Sakari Ailus, Laurent Pinchart, Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli, Broadcom internal kernel review list, John Cox, Dom Cobley, review list, Ezequiel Garcia
Cc: Nicolas Dufresne, John Cox, Stefan Wahren, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Dave Stevenson, Krzysztof Kozlowski

Add a binding for the HEVC decoder IP owned by Raspberry Pi.
Instantiations of the decoder IP can currently be found in the Broadcom
BCM2711 and BCM2712 SoCs.

Reviewed-by: Krzysztof Kozlowski
Signed-off-by: Dave Stevenson
---
 .../bindings/media/raspberrypi,hevc-dec.yaml       | 72 ++++++++++++++++++++++
 MAINTAINERS                                        |  9 +++
 2 files changed, 81 insertions(+)

diff --git a/Documentation/devicetree/bindings/media/raspberrypi,hevc-dec.yaml b/Documentation/devicetree/bindings/media/raspberrypi,hevc-dec.yaml
new file mode 100644
index 000000000000..fe3361bddd1f
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/raspberrypi,hevc-dec.yaml
@@ -0,0 +1,72 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/media/raspberrypi,hevc-dec.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Raspberry Pi HEVC Decoder
+
+maintainers:
+  - John Cox
+  - Dom Cobley
+  - Dave Stevenson
+
+description:
+  The Raspberry Pi HEVC decoder is a hardware video decode accelerator IP block
+  developed and owned by Raspberry Pi.
+
+  Currently it can be found in the Broadcom BCM2711 and BCM2712 processors used
+  on Raspberry Pi 4 and 5 boards respectively.
+
+properties:
+  compatible:
+    oneOf:
+      - const: brcm,bcm2711-hevc-dec
+      - items:
+          - enum:
+              - brcm,bcm2712-hevc-dec
+          - const: brcm,bcm2711-hevc-dec
+
+  reg:
+    items:
+      - description: The HEVC main register region
+      - description: The Interrupt control register region
+
+  reg-names:
+    items:
+      - const: hevc
+      - const: intc
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    items:
+      - description: The HEVC block clock
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - interrupts
+  - clocks
+
+additionalProperties: false
+
+examples:
+  - |
+    #include
+
+    video-codec@7eb10000 {
+        compatible = "brcm,bcm2711-hevc-dec";
+        reg = <0x7eb00000 0x10000>, /* HEVC */
+              <0x7eb10000 0x1000>;  /* INTC */
+        reg-names = "hevc",
+                    "intc";
+
+        interrupts = ;
+
+        clocks = <&clk 0>;
+    };
+
+...
diff --git a/MAINTAINERS b/MAINTAINERS
index 55af015174a5..5d9a495a2b34 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22044,6 +22044,15 @@ L:	linux-edac@vger.kernel.org
 S:	Maintained
 F:	drivers/ras/amd/fmpm.c
 
+RASPBERRY PI HEVC DECODER
+M:	John Cox
+M:	Dom Cobley
+M:	Dave Stevenson
+M:	Raspberry Pi Internal Kernel List
+L:	linux-media@vger.kernel.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/media/raspberrypi,hevc-dec.yaml
+
 RASPBERRY PI PISP BACK END
 M:	Jacopo Mondi
 R:	Raspberry Pi Kernel Maintenance
-- 
2.34.1

From nobody Thu Apr 16 12:32:24 2026
From: Dave Stevenson
Date: Fri, 27 Feb 2026 17:19:10 +0000
Subject: [PATCH v5 5/6] media: platform: Add Raspberry Pi HEVC decoder driver
Message-Id: <20260227-media-rpi-hevc-dec-v5-5-9bb3fc1816de@raspberrypi.com>
References: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
In-Reply-To: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
To: Sakari Ailus, Laurent Pinchart, Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli, Broadcom internal kernel review list, John Cox, Dom Cobley, review list, Ezequiel Garcia
Cc: Nicolas Dufresne, John Cox, Stefan Wahren, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Dave Stevenson
From: John Cox The BCM2711 and BCM2712 SoCs used on Raspberry Pi 4 and Raspberry Pi 5 boards include an HEVC decoder block. Add a driver for it. Signed-off-by: John Cox Signed-off-by: Dave Stevenson --- MAINTAINERS | 1 + drivers/media/platform/raspberrypi/Kconfig | 1 + drivers/media/platform/raspberrypi/Makefile | 1 + .../media/platform/raspberrypi/hevc_dec/Kconfig | 17 + .../media/platform/raspberrypi/hevc_dec/Makefile | 5 + .../media/platform/raspberrypi/hevc_dec/hevc_d.c | 326 +++ .../media/platform/raspberrypi/hevc_dec/hevc_d.h | 195 ++ .../platform/raspberrypi/hevc_dec/hevc_d_h265.c | 2436 ++++++++++++++++= ++++ .../platform/raspberrypi/hevc_dec/hevc_d_h265.h | 22 + .../platform/raspberrypi/hevc_dec/hevc_d_hw.c | 429 ++++ .../platform/raspberrypi/hevc_dec/hevc_d_hw.h | 317 +++ .../platform/raspberrypi/hevc_dec/hevc_d_video.c | 634 +++++ .../platform/raspberrypi/hevc_dec/hevc_d_video.h | 38 + 13 files changed, 4422 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 5d9a495a2b34..ea70e38bbb11 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -22052,6 +22052,7 @@ M: Raspberry Pi Internal Kernel List L: linux-media@vger.kernel.org S: Maintained F: Documentation/devicetree/bindings/media/raspberrypi,hevc-dec.yaml +F: drivers/media/platform/raspberrypi/hevc_dec =20 RASPBERRY PI PISP BACK END M: Jacopo Mondi diff --git a/drivers/media/platform/raspberrypi/Kconfig b/drivers/media/pla= tform/raspberrypi/Kconfig index bd5101ffefb5..fbe2edd24f8e 100644 --- a/drivers/media/platform/raspberrypi/Kconfig +++ b/drivers/media/platform/raspberrypi/Kconfig @@ -2,5 +2,6 @@ =20 comment "Raspberry Pi media platform drivers" =20 +source "drivers/media/platform/raspberrypi/hevc_dec/Kconfig" source "drivers/media/platform/raspberrypi/pisp_be/Kconfig" source "drivers/media/platform/raspberrypi/rp1-cfe/Kconfig" diff --git a/drivers/media/platform/raspberrypi/Makefile b/drivers/media/pl= atform/raspberrypi/Makefile index af7fde84eefe..c63985cd8d17 100644 --- 
a/drivers/media/platform/raspberrypi/Makefile +++ b/drivers/media/platform/raspberrypi/Makefile @@ -1,4 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 =20 +obj-y +=3D hevc_dec/ obj-y +=3D pisp_be/ obj-y +=3D rp1-cfe/ diff --git a/drivers/media/platform/raspberrypi/hevc_dec/Kconfig b/drivers/= media/platform/raspberrypi/hevc_dec/Kconfig new file mode 100644 index 000000000000..ae1fd079e5c9 --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/Kconfig @@ -0,0 +1,17 @@ +# SPDX-License-Identifier: GPL-2.0 + +config VIDEO_RPI_HEVC_DEC + tristate "Raspberry Pi HEVC decoder" + depends on VIDEO_DEV + depends on OF + select MEDIA_CONTROLLER + select MEDIA_CONTROLLER_REQUEST_API + select VIDEOBUF2_DMA_CONTIG + select V4L2_MEM2MEM_DEV + help + Support for the Raspberry Pi HEVC / H265 H/W decoder as a stateless + V4L2 decoder device. + + To compile this driver as a module, choose M here: the module + will be called rpi-hevc-dec. + diff --git a/drivers/media/platform/raspberrypi/hevc_dec/Makefile b/drivers= /media/platform/raspberrypi/hevc_dec/Makefile new file mode 100644 index 000000000000..b6506bf2e00d --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_VIDEO_RPI_HEVC_DEC) +=3D rpi-hevc-dec.o + +rpi-hevc-dec-y =3D hevc_d.o hevc_d_video.o hevc_d_hw.o\ + hevc_d_h265.o diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d.c b/drivers= /media/platform/raspberrypi/hevc_dec/hevc_d.c new file mode 100644 index 000000000000..6bf3e251e2a7 --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d.c @@ -0,0 +1,326 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + * Copyright (C) 2018 Bootlin + */ + +#include +#include +#include + +#include +#include +#include 
+#include + +#include "hevc_d.h" +#include "hevc_d_h265.h" +#include "hevc_d_video.h" +#include "hevc_d_hw.h" + +int hevc_d_v4l2_debug; +module_param_named(debug, hevc_d_v4l2_debug, int, 0644); +MODULE_PARM_DESC(debug, "Debug level 0-2"); + +static const struct v4l2_ctrl_config hevc_d_ctrls[] =3D { + { + .id =3D V4L2_CID_STATELESS_HEVC_SPS, + .ops =3D &hevc_d_hevc_sps_ctrl_ops, + }, { + .id =3D V4L2_CID_STATELESS_HEVC_PPS, + .ops =3D &hevc_d_hevc_pps_ctrl_ops, + }, { + .id =3D V4L2_CID_STATELESS_HEVC_SCALING_MATRIX, + }, { + .id =3D V4L2_CID_STATELESS_HEVC_DECODE_PARAMS, + }, { + .name =3D "Slice param array", + .id =3D V4L2_CID_STATELESS_HEVC_SLICE_PARAMS, + .type =3D V4L2_CTRL_TYPE_HEVC_SLICE_PARAMS, + .flags =3D V4L2_CTRL_FLAG_DYNAMIC_ARRAY, + .dims =3D { 600 }, + }, { + .id =3D V4L2_CID_STATELESS_HEVC_DECODE_MODE, + .min =3D V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED, + .max =3D V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED, + .def =3D V4L2_STATELESS_HEVC_DECODE_MODE_FRAME_BASED, + }, { + .id =3D V4L2_CID_STATELESS_HEVC_START_CODE, + .min =3D V4L2_STATELESS_HEVC_START_CODE_NONE, + .max =3D V4L2_STATELESS_HEVC_START_CODE_ANNEX_B, + .def =3D V4L2_STATELESS_HEVC_START_CODE_NONE, + }, +}; + +void *hevc_d_find_control_data(struct hevc_d_ctx *ctx, u32 id) +{ + struct v4l2_ctrl *const ctrl =3D v4l2_ctrl_find(ctx->fh.ctrl_handler, id); + + return ctrl ? 
ctrl->p_cur.p : NULL; +} + +static int hevc_d_init_ctrls(struct hevc_d_dev *dev, struct hevc_d_ctx *ct= x) +{ + struct v4l2_ctrl_handler *hdl =3D &ctx->hdl; + struct v4l2_ctrl *ctrl; + unsigned int i; + + v4l2_ctrl_handler_init(hdl, ARRAY_SIZE(hevc_d_ctrls)); + if (hdl->error) { + v4l2_err(&dev->v4l2_dev, + "Failed to initialize control handler\n"); + return hdl->error; + } + + for (i =3D 0; i < ARRAY_SIZE(hevc_d_ctrls); i++) { + ctrl =3D v4l2_ctrl_new_custom(hdl, &hevc_d_ctrls[i], ctx); + if (hdl->error) { + v4l2_err(&dev->v4l2_dev, + "Failed to create new custom control id=3D%#x\n", + hevc_d_ctrls[i].id); + + v4l2_ctrl_handler_free(hdl); + return hdl->error; + } + } + + ctx->fh.ctrl_handler =3D hdl; + v4l2_ctrl_handler_setup(hdl); + + return 0; +} + +static int hevc_d_open(struct file *file) +{ + struct hevc_d_dev *dev =3D video_drvdata(file); + struct hevc_d_ctx *ctx =3D NULL; + int ret; + + ctx =3D kzalloc_obj(*ctx, GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + mutex_init(&ctx->ctx_mutex); + + v4l2_fh_init(&ctx->fh, video_devdata(file)); + ctx->dev =3D dev; + + ctx->fh.m2m_ctx =3D v4l2_m2m_ctx_init(dev->m2m_dev, ctx, + &hevc_d_queue_init); + if (IS_ERR(ctx->fh.m2m_ctx)) { + ret =3D PTR_ERR(ctx->fh.m2m_ctx); + goto err_free; + } + + /* The only bit of format info that we can guess now is H265 src + * Everything else we need more info for + */ + hevc_d_prepare_src_format(&ctx->src_fmt); + + ret =3D hevc_d_init_ctrls(dev, ctx); + if (ret) + goto err_ctx; + + v4l2_fh_add(&ctx->fh, file); + return 0; + +err_ctx: + v4l2_m2m_ctx_release(ctx->fh.m2m_ctx); +err_free: + mutex_destroy(&ctx->ctx_mutex); + kfree(ctx); + return ret; +} + +static int hevc_d_release(struct file *file) +{ + struct hevc_d_ctx *ctx =3D container_of(file->private_data, + struct hevc_d_ctx, fh); + + v4l2_fh_del(&ctx->fh, file); + + v4l2_ctrl_handler_free(&ctx->hdl); + + v4l2_m2m_ctx_release(ctx->fh.m2m_ctx); + + v4l2_fh_exit(&ctx->fh); + mutex_destroy(&ctx->ctx_mutex); + + kfree(ctx); + return 0; 
+} + +static void hevc_d_media_req_queue(struct media_request *req) +{ + media_request_mark_manual_completion(req); + v4l2_m2m_request_queue(req); +} + +static const struct v4l2_file_operations hevc_d_fops =3D { + .owner =3D THIS_MODULE, + .open =3D hevc_d_open, + .release =3D hevc_d_release, + .poll =3D v4l2_m2m_fop_poll, + .unlocked_ioctl =3D video_ioctl2, + .mmap =3D v4l2_m2m_fop_mmap, +}; + +static const struct video_device hevc_d_video_device =3D { + .name =3D HEVC_D_NAME, + .vfl_dir =3D VFL_DIR_M2M, + .fops =3D &hevc_d_fops, + .ioctl_ops =3D &hevc_d_ioctl_ops, + .minor =3D -1, + .release =3D video_device_release_empty, + .device_caps =3D V4L2_CAP_VIDEO_M2M_MPLANE | V4L2_CAP_STREAMING, +}; + +static const struct v4l2_m2m_ops hevc_d_m2m_ops =3D { + .device_run =3D hevc_d_device_run, +}; + +static const struct media_device_ops hevc_d_m2m_media_ops =3D { + .req_validate =3D vb2_request_validate, + .req_queue =3D hevc_d_media_req_queue, +}; + +static int hevc_d_probe(struct platform_device *pdev) +{ + struct hevc_d_dev *dev; + struct video_device *vfd; + int ret; + + dev =3D devm_kzalloc(&pdev->dev, sizeof(*dev), GFP_KERNEL); + if (!dev) + return -ENOMEM; + + dev->vfd =3D hevc_d_video_device; + dev->dev =3D &pdev->dev; + dev->pdev =3D pdev; + + ret =3D hevc_d_hw_probe(dev); + if (ret) { + dev_err(&pdev->dev, "Failed to probe hardware - %d\n", ret); + return ret; + } + + mutex_init(&dev->dev_mutex); + + ret =3D v4l2_device_register(&pdev->dev, &dev->v4l2_dev); + if (ret) { + dev_err(&pdev->dev, "Failed to register V4L2 device\n"); + return ret; + } + + vfd =3D &dev->vfd; + vfd->lock =3D &dev->dev_mutex; + vfd->v4l2_dev =3D &dev->v4l2_dev; + + snprintf(vfd->name, sizeof(vfd->name), "%s", hevc_d_video_device.name); + video_set_drvdata(vfd, dev); + + ret =3D dma_set_mask_and_coherent(dev->dev, DMA_BIT_MASK(36)); + if (ret) { + v4l2_err(&dev->v4l2_dev, + "Failed dma_set_mask_and_coherent\n"); + goto err_v4l2; + } + + dev->m2m_dev =3D v4l2_m2m_init(&hevc_d_m2m_ops); + 
if (IS_ERR(dev->m2m_dev)) { + v4l2_err(&dev->v4l2_dev, + "Failed to initialize V4L2 M2M device\n"); + ret =3D PTR_ERR(dev->m2m_dev); + + goto err_v4l2; + } + + dev->mdev.dev =3D &pdev->dev; + strscpy(dev->mdev.model, HEVC_D_NAME, sizeof(dev->mdev.model)); + strscpy(dev->mdev.bus_info, "platform:" HEVC_D_NAME, + sizeof(dev->mdev.bus_info)); + + media_device_init(&dev->mdev); + dev->mdev.ops =3D &hevc_d_m2m_media_ops; + dev->v4l2_dev.mdev =3D &dev->mdev; + + ret =3D video_register_device(vfd, VFL_TYPE_VIDEO, -1); + if (ret) { + v4l2_err(&dev->v4l2_dev, "Failed to register video device\n"); + goto err_m2m; + } + + v4l2_info(&dev->v4l2_dev, + "Device registered as /dev/video%d\n", vfd->num); + + ret =3D v4l2_m2m_register_media_controller(dev->m2m_dev, vfd, + MEDIA_ENT_F_PROC_VIDEO_DECODER); + if (ret) { + v4l2_err(&dev->v4l2_dev, + "Failed to initialize V4L2 M2M media controller\n"); + goto err_video; + } + + ret =3D media_device_register(&dev->mdev); + if (ret) { + v4l2_err(&dev->v4l2_dev, "Failed to register media device\n"); + goto err_m2m_mc; + } + + platform_set_drvdata(pdev, dev); + + return 0; + +err_m2m_mc: + v4l2_m2m_unregister_media_controller(dev->m2m_dev); +err_video: + video_unregister_device(&dev->vfd); +err_m2m: + v4l2_m2m_release(dev->m2m_dev); +err_v4l2: + v4l2_device_unregister(&dev->v4l2_dev); + + return ret; +} + +static void hevc_d_remove(struct platform_device *pdev) +{ + struct hevc_d_dev *dev =3D platform_get_drvdata(pdev); + + media_device_unregister(&dev->mdev); + v4l2_m2m_unregister_media_controller(dev->m2m_dev); + media_device_cleanup(&dev->mdev); + + v4l2_m2m_release(dev->m2m_dev); + video_unregister_device(&dev->vfd); + v4l2_device_unregister(&dev->v4l2_dev); + + hevc_d_hw_remove(dev); +} + +static const struct of_device_id hevc_d_dt_match[] =3D { + { .compatible =3D "brcm,bcm2711-hevc-dec", }, + { /* sentinel */ } +}; +MODULE_DEVICE_TABLE(of, hevc_d_dt_match); + +static struct platform_driver hevc_d_driver =3D { + .probe =3D 
hevc_d_probe, + .remove =3D hevc_d_remove, + .driver =3D { + .name =3D HEVC_D_NAME, + .of_match_table =3D of_match_ptr(hevc_d_dt_match), + }, +}; +module_platform_driver(hevc_d_driver); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("John Cox "); +MODULE_DESCRIPTION("Raspberry Pi HEVC V4L2 driver"); diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d.h b/drivers= /media/platform/raspberrypi/hevc_dec/hevc_d.h new file mode 100644 index 000000000000..92de9b869569 --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d.h @@ -0,0 +1,195 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + * Copyright (C) 2018 Bootlin + */ + +#ifndef _HEVC_D_H_ +#define _HEVC_D_H_ + +#include +#include +#include +#include +#include +#include +#include + +/* Decoder limits */ +#define HEVC_D_MIN_WIDTH 16U +#define HEVC_D_MIN_HEIGHT 16U +#define HEVC_D_DEFAULT_WIDTH 1920U +#define HEVC_D_DEFAULT_HEIGHT 1088U +#define HEVC_D_MAX_WIDTH 4096U +#define HEVC_D_MAX_HEIGHT 4096U + +/* + * Q sizes of 3 give one entry being prepared, one waiting and + * one processing. Testing shows no advantage to greater Q depths + */ + +/* + * Max processing Q size Phase 0 -> Phase 1 + * This is per open context + */ +#define HEVC_D_P1BUF_COUNT 3 +/* + * Max processing Q size Phase 1 -> Phase 2 + * This is per device + */ +#define HEVC_D_P2BUF_COUNT 3 +/* + * Number of decode environments a context has + * There is no independent flow control on this number so it must be + * capable of holding P1 + P2 entries. 
+ */ +#define HEVC_D_DEC_ENV_COUNT (HEVC_D_P1BUF_COUNT + HEVC_D_P2BUF_COUNT) + +#define HEVC_D_NAME "rpi-hevc-dec" + +enum hevc_d_irq_status { + HEVC_D_IRQ_NONE, + HEVC_D_IRQ_ERROR, + HEVC_D_IRQ_OK, +}; + +struct hevc_d_h265_run { + u32 slice_ents; + const struct v4l2_ctrl_hevc_sps *sps; + const struct v4l2_ctrl_hevc_pps *pps; + const struct v4l2_ctrl_hevc_decode_params *dec; + const struct v4l2_ctrl_hevc_slice_params *slice_params; + const struct v4l2_ctrl_hevc_scaling_matrix *scaling_matrix; +}; + +struct hevc_d_run { + struct vb2_v4l2_buffer *src; + struct vb2_v4l2_buffer *dst; + + struct hevc_d_h265_run h265; +}; + +struct hevc_d_buffer { + struct v4l2_m2m_buffer m2m_buf; +}; + +struct hevc_d_dec_state; +struct hevc_d_dec_env; + +struct hevc_d_hwbuf { + size_t size; + u8 *ptr; + dma_addr_t addr; + unsigned long attrs; +}; + +struct hevc_d_dev; +typedef void (*hevc_d_irq_callback)(struct hevc_d_dev *dev, void *ctx); + +struct hevc_d_q_aux; +#define HEVC_D_AUX_ENT_COUNT VB2_MAX_FRAME + +struct hevc_d_ctx { + struct v4l2_fh fh; + struct hevc_d_dev *dev; + + struct v4l2_pix_format_mplane src_fmt; + struct v4l2_pix_format_mplane dst_fmt; + int dst_fmt_set; + + /* + * fatal_err is set if an error has occurred s.t. 
decode cannot + * continue (such as running out of CMA) + */ + int fatal_err; + + /* Lock for queue operations */ + struct mutex ctx_mutex; + + struct v4l2_ctrl_handler hdl; + + /* + * state contains stuff that is only needed in phase0 + * it could be held in dec_env but that would be wasteful + */ + struct hevc_d_dec_state *state; + struct hevc_d_dec_env *dec0; + + /* Spinlock protecting dec_free */ + spinlock_t dec_lock; + struct hevc_d_dec_env *dec_free; + + struct hevc_d_dec_env *dec_pool; + + atomic_t p1out; + + unsigned int p2idx; + struct hevc_d_hwbuf pu_bufs[HEVC_D_P2BUF_COUNT]; + struct hevc_d_hwbuf coeff_bufs[HEVC_D_P2BUF_COUNT]; + + /* Spinlock protecting aux_free */ + spinlock_t aux_lock; + struct hevc_d_q_aux *aux_free; + + struct hevc_d_q_aux *aux_ents[HEVC_D_AUX_ENT_COUNT]; + + unsigned int colmv_stride; + unsigned int colmv_picsize; +}; + +struct hevc_d_hw_irq_ent; + +#define HEVC_D_ICTL_ENABLE_UNLIMITED (-1) + +struct hevc_d_hw_irq_ctrl { + /* Spinlock protecting claim and tail */ + spinlock_t lock; + struct hevc_d_hw_irq_ent *claim; + struct hevc_d_hw_irq_ent *tail; + + /* Ent for pending irq - also prevents sched */ + struct hevc_d_hw_irq_ent *irq; + /* Non-zero =3D> do not start a new job - outer layer sched pending */ + int no_sched; + /* Enable count. 
-1 always OK, 0 do not sched, +ve sched & count down */ + int enable; + /* Thread CB requested */ + bool thread_reqed; +}; + +struct hevc_d_dev { + struct v4l2_device v4l2_dev; + struct video_device vfd; + struct media_device mdev; + struct platform_device *pdev; + struct device *dev; + struct v4l2_m2m_dev *m2m_dev; + + /* Video device file (vfd) mutex */ + struct mutex dev_mutex; + + void __iomem *base_irq; + void __iomem *base_h265; + + struct clk *clock; + unsigned long max_clock_rate; + + struct hevc_d_hw_irq_ctrl ic_active1; + struct hevc_d_hw_irq_ctrl ic_active2; +}; + +extern int hevc_d_v4l2_debug; +#define hevc_d_dbg(level, dev, fmt, arg...)\ + v4l2_dbg((level), hevc_d_v4l2_debug, (dev), fmt, ## arg) + +struct v4l2_ctrl *hevc_d_find_ctrl(struct hevc_d_ctx *ctx, u32 id); +void *hevc_d_find_control_data(struct hevc_d_ctx *ctx, u32 id); + +#endif diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_h265.c b/dr= ivers/media/platform/raspberrypi/hevc_dec/hevc_d_h265.c new file mode 100644 index 000000000000..c7bdb4139cd9 --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_h265.c @@ -0,0 +1,2436 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + * Copyright (C) 2018 Bootlin + */ + +#include +#include + +#include + +#include "hevc_d.h" +#include "hevc_d_h265.h" +#include "hevc_d_hw.h" +#include "hevc_d_video.h" + +/* Maximum length of command buffer before we rate it an error */ +#define CMD_BUFFER_SIZE_MAX 0x100000 + +/* Initial size of command FIFO in commands. + * The FIFO will be extended if this value is exceeded but 8192 seems to + * deal with all streams found in the wild. 
+ */ +#define CMD_BUFFER_SIZE_INIT 8192 + +enum hevc_slice_type { + HEVC_SLICE_B =3D 0, + HEVC_SLICE_P =3D 1, + HEVC_SLICE_I =3D 2, +}; + +enum hevc_layer { L0 =3D 0, L1 =3D 1 }; + +static int hwbuf_alloc(struct hevc_d_dev *const dev, struct hevc_d_hwbuf *= hwbuf, + size_t size, unsigned long attrs) +{ + hwbuf->size =3D size; + hwbuf->attrs =3D attrs; + hwbuf->addr =3D 0; + hwbuf->ptr =3D dma_alloc_attrs(dev->dev, hwbuf->size, &hwbuf->addr, + GFP_KERNEL, hwbuf->attrs); + return !hwbuf->ptr ? -ENOMEM : 0; +} + +static void hwbuf_free(struct hevc_d_dev *const dev, + struct hevc_d_hwbuf *const hwbuf) +{ + if (hwbuf->ptr) + dma_free_attrs(dev->dev, hwbuf->size, hwbuf->ptr, hwbuf->addr, + hwbuf->attrs); + hwbuf->size =3D 0; + hwbuf->ptr =3D NULL; + hwbuf->addr =3D 0; + hwbuf->attrs =3D 0; +} + +/* Realloc but do not copy + * + * Frees then allocs. + * If the alloc fails then it attempts to re-allocate the old size + * On error check hwbuf->ptr to determine if anything is currently + * allocated. 
+ */ +static int hwbuf_realloc_new(struct hevc_d_dev * const dev, + struct hevc_d_hwbuf * const hwbuf, size_t size) +{ + const size_t old_size =3D hwbuf->size; + + if (size =3D=3D hwbuf->size) + return 0; + + if (hwbuf->ptr) + dma_free_attrs(dev->dev, hwbuf->size, hwbuf->ptr, + hwbuf->addr, hwbuf->attrs); + + hwbuf->addr =3D 0; + hwbuf->size =3D size; + hwbuf->ptr =3D dma_alloc_attrs(dev->dev, hwbuf->size, + &hwbuf->addr, GFP_KERNEL, hwbuf->attrs); + + if (!hwbuf->ptr) { + hwbuf->addr =3D 0; + hwbuf->size =3D old_size; + hwbuf->ptr =3D dma_alloc_attrs(dev->dev, hwbuf->size, + &hwbuf->addr, GFP_KERNEL, hwbuf->attrs); + if (!hwbuf->ptr) { + hwbuf->size =3D 0; + hwbuf->addr =3D 0; + hwbuf->attrs =3D 0; + } + return -ENOMEM; + } + + return 0; +} + +/* Realloc with copy */ +static int hwbuf_realloc_copy(struct hevc_d_dev * const dev, + struct hevc_d_hwbuf * const hwbuf, size_t newsize) +{ + struct hevc_d_hwbuf bnew; + + if (newsize <=3D hwbuf->size) + return 0; + + if (hwbuf_alloc(dev, &bnew, newsize, hwbuf->attrs)) + return -ENOMEM; + + memcpy(bnew.ptr, hwbuf->ptr, hwbuf->size); + hwbuf_free(dev, hwbuf); + *hwbuf =3D bnew; + return 0; +} + +static size_t next_size(const size_t x) +{ + return hevc_d_round_up_size(x + 1); +} + +#define NUM_SCALING_FACTORS 4064 /* Not a typo =3D 0xbe0 + 0x400 */ + +#define AXI_BASE64 0 + +#define PROB_BACKUP ((20 << 12) + (20 << 6) + (0 << 0)) +#define PROB_RELOAD ((20 << 12) + (20 << 0) + (0 << 6)) + +#define HEVC_MAX_REFS V4L2_HEVC_DPB_ENTRIES_NUM_MAX + +struct hevc_d_q_aux { + unsigned int refcount; + unsigned int q_index; + struct hevc_d_q_aux *next; + struct hevc_d_hwbuf col; +}; + +enum hevc_d_decode_state { + HEVC_D_DECODE_SLICE_START, + HEVC_D_DECODE_ERROR_DONE, + HEVC_D_DECODE_PHASE1, + HEVC_D_DECODE_END, +}; + +/* + * Decode environment + * One dec_env is allocated per active frame decode and holds state that + * needs to persist between decode phases or callbacks + */ +struct hevc_d_dec_env { + struct hevc_d_ctx *ctx; + struct 
hevc_d_dec_env *next; + + enum hevc_d_decode_state state; + unsigned int decode_order; + int p1_status; /* Phase 1 status - what to realloc */ + + struct hevc_d_hwbuf cmd; + unsigned int cmd_len; + unsigned int cmd_max; + unsigned int num_slice_msgs; + unsigned int pic_width_in_ctbs_y; + unsigned int pic_height_in_ctbs_y; + unsigned int dpbno_col; + u32 reg_slicestart; + int collocated_from_l0_flag; + /* + * Last CTB/Tile X,Y processed by (wpp_)entry_point + * Could be in _state as P0 only but needs updating where _state + * is const + */ + unsigned int entry_ctb_x; + unsigned int entry_ctb_y; + unsigned int entry_tile_x; + unsigned int entry_tile_y; + unsigned int entry_qp; + u32 entry_slice; + + u32 rpi_config2; + u32 rpi_framesize; + u32 rpi_currpoc; + + struct vb2_v4l2_buffer *frame_buf; + struct vb2_v4l2_buffer *src_buf; + dma_addr_t frame_luma_addr; + unsigned int luma_stride; + dma_addr_t frame_chroma_addr; + unsigned int chroma_stride; + dma_addr_t ref_addrs[16][2]; + struct hevc_d_q_aux *frame_aux; + struct hevc_d_q_aux *col_aux; + + dma_addr_t pu_base_vc; + dma_addr_t coeff_base_vc; + u32 pu_stride; + u32 coeff_stride; + +#define SLICE_MSGS_MAX (2 * HEVC_MAX_REFS * 8 + 3) + u16 slice_msgs[SLICE_MSGS_MAX]; + u8 scaling_factors[NUM_SCALING_FACTORS]; + + struct media_request *request; + struct hevc_d_hw_irq_ent irq_ent; +}; + +/* + * Decode state + * Decode information that persists between frame decodes but is only + * used or changed in phase 0 (setup) + */ +struct hevc_d_dec_state { + struct v4l2_ctrl_hevc_sps sps; + struct v4l2_ctrl_hevc_pps pps; + + /* Helper vars & tables derived from sps/pps */ + unsigned int log2_ctb_size; /* log2 width of a CTB */ + unsigned int ctb_width; /* Width in CTBs */ + unsigned int ctb_height; /* Height in CTBs */ + unsigned int ctb_size; /* Pic area in CTBs */ + unsigned int tile_width; /* Width in tiles */ + unsigned int tile_height; /* Height in tiles */ + + int *col_bd; + int *row_bd; + int *ctb_addr_rs_to_ts; + int 
*ctb_addr_ts_to_rs; + + /* Aux storage for DPB */ + struct hevc_d_q_aux *ref_aux[HEVC_MAX_REFS]; + struct hevc_d_q_aux *frame_aux; + + /* Slice vars */ + unsigned int slice_idx; + unsigned int idx_inuse; + bool slice_temporal_mvp; /* Slice flag but constant for frame */ + bool use_aux; + bool mk_aux; + + /* Temp vars per run - don't actually need to persist */ + dma_addr_t src_addr; + const struct v4l2_ctrl_hevc_slice_params *sh; + const struct v4l2_ctrl_hevc_decode_params *dec; + unsigned int nb_refs[2]; + unsigned int slice_qp; + unsigned int max_num_merge_cand; /* 0 if I-slice */ + bool dependent_slice_segment_flag; + u32 data_len; + + unsigned int start_ts; /* slice_segment_addr -> ts */ + unsigned int start_ctb_x; /* CTB X,Y of start_ts */ + unsigned int start_ctb_y; + unsigned int prev_ctb_x; /* CTB X,Y of start_ts - 1 */ + unsigned int prev_ctb_y; +}; + +/* Phase 1 command and bit FIFOs */ +static int cmds_check_space(struct hevc_d_dec_env *const de, unsigned int = n) +{ + unsigned int newmax; + + if (de->cmd_len + n <=3D de->cmd_max) + return 0; + + newmax =3D roundup_pow_of_two(de->cmd_len + n); + if (newmax > CMD_BUFFER_SIZE_MAX) { + v4l2_err(&de->ctx->dev->v4l2_dev, + "%s: n %u implausible\n", __func__, newmax); + return -ENOMEM; + } + + if (hwbuf_realloc_copy(de->ctx->dev, &de->cmd, newmax * sizeof(u64))) { + v4l2_err(&de->ctx->dev->v4l2_dev, + "Failed cmd buffer realloc from %u to %u\n", + de->cmd_max, newmax); + return -ENOMEM; + } + hevc_d_dbg(1, &de->ctx->dev->v4l2_dev, + "cmd buffer realloc from %u to %u\n", de->cmd_max, newmax); + + de->cmd_max =3D newmax; + return 0; +} + +static void p1_apb_write(struct hevc_d_dec_env *const de, const u16 addr, + const u32 data) +{ + WRITE_ONCE(((u64 *)de->cmd.ptr)[de->cmd_len], addr | ((u64)data << 32)); + de->cmd_len++; +} + +static int ctb_to_tile(unsigned int ctb, unsigned int *bd, int num) +{ + int i; + + for (i =3D 1; ctb >=3D bd[i]; i++) + ; /* bd[] has num+1 elements; bd[0]=3D0; */ + + return i - 1; +} + 
+static unsigned int ctb_to_tile_x(const struct hevc_d_dec_state *const s, + const unsigned int ctb_x) +{ + return ctb_to_tile(ctb_x, s->col_bd, s->tile_width); +} + +static unsigned int ctb_to_tile_y(const struct hevc_d_dec_state *const s, + const unsigned int ctb_y) +{ + return ctb_to_tile(ctb_y, s->row_bd, s->tile_height); +} + +static void aux_q_free(struct hevc_d_ctx *const ctx, + struct hevc_d_q_aux *const aq) +{ + struct hevc_d_dev *const dev =3D ctx->dev; + + hwbuf_free(dev, &aq->col); + kfree(aq); +} + +static struct hevc_d_q_aux *aux_q_alloc(struct hevc_d_ctx *const ctx, + const unsigned int q_index) +{ + struct hevc_d_dev *const dev =3D ctx->dev; + struct hevc_d_q_aux *const aq =3D kzalloc_obj(*aq, GFP_KERNEL); + + if (!aq) + return NULL; + + if (hwbuf_alloc(dev, &aq->col, ctx->colmv_picsize, + DMA_ATTR_FORCE_CONTIGUOUS | DMA_ATTR_NO_KERNEL_MAPPING)) + goto fail; + + /* + * Spinlock not required as called in P0 only and + * aux checks done by _new + */ + aq->refcount =3D 1; + aq->q_index =3D q_index; + ctx->aux_ents[q_index] =3D aq; + return aq; + +fail: + kfree(aq); + return NULL; +} + +static struct hevc_d_q_aux *aux_q_new(struct hevc_d_ctx *const ctx, + const unsigned int q_index) +{ + struct hevc_d_q_aux *aq; + unsigned long lockflags; + + spin_lock_irqsave(&ctx->aux_lock, lockflags); + /* + * If we already have this allocated to a slot then use that + * and assume that it will all work itself out in the pipeline + */ + aq =3D ctx->aux_ents[q_index]; + if (aq) { + ++aq->refcount; + } else { + aq =3D ctx->aux_free; + if (aq) { + ctx->aux_free =3D aq->next; + aq->next =3D NULL; + aq->refcount =3D 1; + aq->q_index =3D q_index; + ctx->aux_ents[q_index] =3D aq; + } + } + spin_unlock_irqrestore(&ctx->aux_lock, lockflags); + + if (!aq) + aq =3D aux_q_alloc(ctx, q_index); + + return aq; +} + +static struct hevc_d_q_aux *aux_q_ref_idx(struct hevc_d_ctx *const ctx, + const int q_index) +{ + unsigned long lockflags; + struct hevc_d_q_aux *aq; + + 
spin_lock_irqsave(&ctx->aux_lock, lockflags); + aq =3D ctx->aux_ents[q_index]; + if (aq) + ++aq->refcount; + spin_unlock_irqrestore(&ctx->aux_lock, lockflags); + + return aq; +} + +static struct hevc_d_q_aux *aux_q_ref(struct hevc_d_ctx *const ctx, + struct hevc_d_q_aux *const aq) +{ + unsigned long lockflags; + + if (aq) { + spin_lock_irqsave(&ctx->aux_lock, lockflags); + ++aq->refcount; + spin_unlock_irqrestore(&ctx->aux_lock, lockflags); + } + return aq; +} + +static void aux_q_release(struct hevc_d_ctx *const ctx, + struct hevc_d_q_aux **const paq) +{ + struct hevc_d_q_aux *const aq =3D *paq; + unsigned long lockflags; + + if (!aq) + return; + + *paq =3D NULL; + + spin_lock_irqsave(&ctx->aux_lock, lockflags); + if (--aq->refcount =3D=3D 0) { + aq->next =3D ctx->aux_free; + ctx->aux_free =3D aq; + ctx->aux_ents[aq->q_index] =3D NULL; + aq->q_index =3D ~0U; + } + spin_unlock_irqrestore(&ctx->aux_lock, lockflags); +} + +static void aux_q_init(struct hevc_d_ctx *const ctx) +{ + spin_lock_init(&ctx->aux_lock); + ctx->aux_free =3D NULL; +} + +static void aux_q_uninit(struct hevc_d_ctx *const ctx) +{ + struct hevc_d_q_aux *aq; + + ctx->colmv_picsize =3D 0; + ctx->colmv_stride =3D 0; + while ((aq =3D ctx->aux_free) !=3D NULL) { + ctx->aux_free =3D aq->next; + aux_q_free(ctx, aq); + } +} + +/* + * Initialisation process for context variables (CABAC init) + * see H.265 9.3.2.2 + * + * N.B. 
If comparing with FFmpeg note that this h/w uses slightly different + * offsets to FFmpeg's array + */ + +/* Actual number of values */ +#define RPI_PROB_VALS 154U +/* Rounded up as we copy words */ +#define RPI_PROB_ARRAY_SIZE ((154 + 3) & ~3) + +/* Initialiser values - see tables H.265 9-4 through 9-42 */ +static const u8 prob_init[3][156] =3D { + { + 153, 200, 139, 141, 157, 154, 154, 154, 154, 154, 184, 154, 154, + 154, 184, 63, 154, 154, 154, 154, 154, 154, 154, 154, 154, 154, + 154, 154, 154, 153, 138, 138, 111, 141, 94, 138, 182, 154, 154, + 154, 140, 92, 137, 138, 140, 152, 138, 139, 153, 74, 149, 92, + 139, 107, 122, 152, 140, 179, 166, 182, 140, 227, 122, 197, 110, + 110, 124, 125, 140, 153, 125, 127, 140, 109, 111, 143, 127, 111, + 79, 108, 123, 63, 110, 110, 124, 125, 140, 153, 125, 127, 140, + 109, 111, 143, 127, 111, 79, 108, 123, 63, 91, 171, 134, 141, + 138, 153, 136, 167, 152, 152, 139, 139, 111, 111, 125, 110, 110, + 94, 124, 108, 124, 107, 125, 141, 179, 153, 125, 107, 125, 141, + 179, 153, 125, 107, 125, 141, 179, 153, 125, 140, 139, 182, 182, + 152, 136, 152, 136, 153, 136, 139, 111, 136, 139, 111, 0, 0, + }, + { + 153, 185, 107, 139, 126, 197, 185, 201, 154, 149, 154, 139, 154, + 154, 154, 152, 110, 122, 95, 79, 63, 31, 31, 153, 153, 168, + 140, 198, 79, 124, 138, 94, 153, 111, 149, 107, 167, 154, 154, + 154, 154, 196, 196, 167, 154, 152, 167, 182, 182, 134, 149, 136, + 153, 121, 136, 137, 169, 194, 166, 167, 154, 167, 137, 182, 125, + 110, 94, 110, 95, 79, 125, 111, 110, 78, 110, 111, 111, 95, + 94, 108, 123, 108, 125, 110, 94, 110, 95, 79, 125, 111, 110, + 78, 110, 111, 111, 95, 94, 108, 123, 108, 121, 140, 61, 154, + 107, 167, 91, 122, 107, 167, 139, 139, 155, 154, 139, 153, 139, + 123, 123, 63, 153, 166, 183, 140, 136, 153, 154, 166, 183, 140, + 136, 153, 154, 166, 183, 140, 136, 153, 154, 170, 153, 123, 123, + 107, 121, 107, 121, 167, 151, 183, 140, 151, 183, 140, 0, 0, + }, + { + 153, 160, 107, 139, 126, 197, 185, 201, 154, 134, 154, 139, 
154, + 154, 183, 152, 154, 137, 95, 79, 63, 31, 31, 153, 153, 168, + 169, 198, 79, 224, 167, 122, 153, 111, 149, 92, 167, 154, 154, + 154, 154, 196, 167, 167, 154, 152, 167, 182, 182, 134, 149, 136, + 153, 121, 136, 122, 169, 208, 166, 167, 154, 152, 167, 182, 125, + 110, 124, 110, 95, 94, 125, 111, 111, 79, 125, 126, 111, 111, + 79, 108, 123, 93, 125, 110, 124, 110, 95, 94, 125, 111, 111, + 79, 125, 126, 111, 111, 79, 108, 123, 93, 121, 140, 61, 154, + 107, 167, 91, 107, 107, 167, 139, 139, 170, 154, 139, 153, 139, + 123, 123, 63, 124, 166, 183, 140, 136, 153, 154, 166, 183, 140, + 136, 153, 154, 166, 183, 140, 136, 153, 154, 170, 153, 138, 138, + 122, 121, 122, 121, 167, 151, 183, 140, 151, 183, 140, 0, 0, + }, +}; + +#define CMDS_WRITE_PROB ((RPI_PROB_ARRAY_SIZE / 4) + 1) + +static void write_prob(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + const unsigned int init_type = + ((s->sh->flags & V4L2_HEVC_SLICE_PARAMS_FLAG_CABAC_INIT) != 0 && + s->sh->slice_type != HEVC_SLICE_I) ?
+ s->sh->slice_type + 1 : + 2 - s->sh->slice_type; + const int q = clamp((int)s->slice_qp, 0, 51); + const u8 *p = prob_init[init_type]; + u8 dst[RPI_PROB_ARRAY_SIZE]; + unsigned int i; + + for (i = 0; i < RPI_PROB_VALS; i++) { + int init_value = p[i]; + int m = (init_value >> 4) * 5 - 45; + int n = ((init_value & 15) << 3) - 16; + int pre = 2 * (((m * q) >> 4) + n) - 127; + + pre ^= pre >> 31; + if (pre > 124) + pre = 124 + (pre & 1); + dst[i] = pre; + } + for (i = RPI_PROB_VALS; i != RPI_PROB_ARRAY_SIZE; ++i) + dst[i] = 0; + + for (i = 0; i < RPI_PROB_ARRAY_SIZE; i += 4) + p1_apb_write(de, 0x1000 + i, + dst[i] + (dst[i + 1] << 8) + (dst[i + 2] << 16) + + (dst[i + 3] << 24)); + + /* + * Having written the prob array back it up + * This is not always needed but is a small overhead that simplifies + * (and speeds up) some multi-tile & WPP scenarios + * There are no scenarios where having written a prob we ever want + * a previous (non-initial) state back + */ + p1_apb_write(de, RPI_TRANSFER, PROB_BACKUP); +} + +#define CMDS_WRITE_SCALING_FACTORS NUM_SCALING_FACTORS +static void write_scaling_factors(struct hevc_d_dec_env *const de) +{ + const u8 *p = (u8 *)de->scaling_factors; + int i; + + for (i = 0; i < NUM_SCALING_FACTORS; i += 4, p += 4) + p1_apb_write(de, 0x2000 + i, + p[0] + (p[1] << 8) + (p[2] << 16) + (p[3] << 24)); +} + +static inline __u32 dma_to_axi_addr(dma_addr_t a) +{ + return (__u32)(a >> 6); +} + +#define CMDS_WRITE_BITSTREAM 4 +static int write_bitstream(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + /* V4L2 always has emulation prevention bytes in the stream */ + const int rpi_use_emu = 1; + const unsigned int len = s->data_len; + const dma_addr_t addr = s->src_addr + s->sh->data_byte_offset; + const unsigned int offset = addr & 63; + + p1_apb_write(de, RPI_BFBASE, dma_to_axi_addr(addr)); + p1_apb_write(de, RPI_BFNUM, len); + p1_apb_write(de, RPI_BFCONTROL, offset + (1
<< 7)); /* Stop */ + p1_apb_write(de, RPI_BFCONTROL, offset + (rpi_use_emu << 6)); + return 0; +} + +/* + * The slice constant part of the slice register - width and height need to + * be ORed in later as they are per-tile / WPP-row + */ +static u32 slice_reg_const(const struct hevc_d_dec_state *const s) +{ + u32 x = (s->max_num_merge_cand << 0) | + (s->nb_refs[L0] << 4) | + (s->nb_refs[L1] << 8) | + (s->sh->slice_type << 12); + + if (s->sh->flags & V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_SAO_LUMA) + x |= BIT(14); + if (s->sh->flags & V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_SAO_CHROMA) + x |= BIT(15); + if (s->sh->slice_type == HEVC_SLICE_B && + (s->sh->flags & V4L2_HEVC_SLICE_PARAMS_FLAG_MVD_L1_ZERO)) + x |= BIT(16); + + return x; +} + +#define CMDS_NEW_SLICE_SEGMENT (4 + CMDS_WRITE_SCALING_FACTORS) + +static void new_slice_segment(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + const struct v4l2_ctrl_hevc_sps *const sps = &s->sps; + const struct v4l2_ctrl_hevc_pps *const pps = &s->pps; + + p1_apb_write(de, + RPI_SPS0, + ((sps->log2_min_luma_coding_block_size_minus3 + 3) << 0) | + (s->log2_ctb_size << 4) | + ((sps->log2_min_luma_transform_block_size_minus2 + 2) + << 8) | + ((sps->log2_min_luma_transform_block_size_minus2 + 2 + + sps->log2_diff_max_min_luma_transform_block_size) + << 12) | + ((sps->bit_depth_luma_minus8 + 8) << 16) | + ((sps->bit_depth_chroma_minus8 + 8) << 20) | + (sps->max_transform_hierarchy_depth_intra << 24) | + (sps->max_transform_hierarchy_depth_inter << 28)); + + p1_apb_write(de, + RPI_SPS1, + ((sps->pcm_sample_bit_depth_luma_minus1 + 1) << 0) | + ((sps->pcm_sample_bit_depth_chroma_minus1 + 1) << 4) | + ((sps->log2_min_pcm_luma_coding_block_size_minus3 + 3) + << 8) | + ((sps->log2_min_pcm_luma_coding_block_size_minus3 + 3 + + sps->log2_diff_max_min_pcm_luma_coding_block_size) + << 12) | + (((sps->flags & V4L2_HEVC_SPS_FLAG_SEPARATE_COLOUR_PLANE) ?
+ 0 : sps->chroma_format_idc) << 16) | + ((!!(sps->flags & V4L2_HEVC_SPS_FLAG_AMP_ENABLED)) << 18) | + ((!!(sps->flags & V4L2_HEVC_SPS_FLAG_PCM_ENABLED)) << 19) | + ((!!(sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED)) + << 20) | + ((!!(sps->flags & + V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED)) + << 21)); + + p1_apb_write(de, + RPI_PPS, + ((s->log2_ctb_size - pps->diff_cu_qp_delta_depth) << 0) | + ((!!(pps->flags & V4L2_HEVC_PPS_FLAG_CU_QP_DELTA_ENABLED)) + << 4) | + ((!!(pps->flags & + V4L2_HEVC_PPS_FLAG_TRANSQUANT_BYPASS_ENABLED)) + << 5) | + ((!!(pps->flags & V4L2_HEVC_PPS_FLAG_TRANSFORM_SKIP_ENABLED)) + << 6) | + ((!!(pps->flags & + V4L2_HEVC_PPS_FLAG_SIGN_DATA_HIDING_ENABLED)) + << 7) | + (((pps->pps_cb_qp_offset + s->sh->slice_cb_qp_offset) & 255) + << 8) | + (((pps->pps_cr_qp_offset + s->sh->slice_cr_qp_offset) & 255) + << 16) | + ((!!(pps->flags & + V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED)) + << 24)); + + if (!s->start_ts && + (sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED) != 0) + write_scaling_factors(de); + + if (!s->dependent_slice_segment_flag) { + int ctb_col = s->sh->slice_segment_addr % + de->pic_width_in_ctbs_y; + int ctb_row = s->sh->slice_segment_addr / + de->pic_width_in_ctbs_y; + + de->reg_slicestart = (ctb_col << 0) + (ctb_row << 16); + } + + p1_apb_write(de, RPI_SLICESTART, de->reg_slicestart); +} + +/* Slice messages */ + +static void msg_slice(struct hevc_d_dec_env *const de, const u16 msg) +{ + de->slice_msgs[de->num_slice_msgs++] = msg; +} + +#define CMDS_PROGRAM_SLICECMDS (1 + SLICE_MSGS_MAX) +static void program_slicecmds(struct hevc_d_dec_env *const de, + const int sliceid) +{ + int i; + + p1_apb_write(de, RPI_SLICECMDS, de->num_slice_msgs + (sliceid << 8)); + + for (i = 0; i < de->num_slice_msgs; i++) + p1_apb_write(de, 0x4000 + 4 * i, de->slice_msgs[i] & 0xffff); +} + +/* NoBackwardPredictionFlag 8.3.5 - Simply checks POCs of the frames referenced + * by the idx array against cur_poc.
Needs to be called twice (with L0 & L1) to + * get NoBackwardPredictionFlag. + */ +static int has_backward(const struct v4l2_hevc_dpb_entry *const dpb, + const __u8 *const idx, const unsigned int n, + const s32 cur_poc) +{ + unsigned int i; + + for (i = 0; i < n; ++i) { + if (cur_poc < dpb[idx[i]].pic_order_cnt_val) + return 0; + } + return 1; +} + +static void pre_slice_decode(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + const struct v4l2_ctrl_hevc_slice_params *const sh = s->sh; + const struct v4l2_ctrl_hevc_decode_params *const dec = s->dec; + int weighted_pred_flag, idx; + u16 cmd_slice; + unsigned int collocated_from_l0_flag; + + de->num_slice_msgs = 0; + + cmd_slice = 0; + if (sh->slice_type == HEVC_SLICE_I) + cmd_slice = 1; + if (sh->slice_type == HEVC_SLICE_P) + cmd_slice = 2; + if (sh->slice_type == HEVC_SLICE_B) + cmd_slice = 3; + + cmd_slice |= (s->nb_refs[L0] << 2) | (s->nb_refs[L1] << 6) | + (s->max_num_merge_cand << 11); + + collocated_from_l0_flag = + !s->slice_temporal_mvp || + sh->slice_type != HEVC_SLICE_B || + (sh->flags & V4L2_HEVC_SLICE_PARAMS_FLAG_COLLOCATED_FROM_L0); + cmd_slice |= collocated_from_l0_flag << 14; + + if (sh->slice_type == HEVC_SLICE_P || sh->slice_type == HEVC_SLICE_B) { + /* Flag to say all reference pictures are from the past */ + const int no_backward_pred_flag = + has_backward(dec->dpb, sh->ref_idx_l0, s->nb_refs[L0], + sh->slice_pic_order_cnt) && + has_backward(dec->dpb, sh->ref_idx_l1, s->nb_refs[L1], + sh->slice_pic_order_cnt); + cmd_slice |= no_backward_pred_flag << 10; + msg_slice(de, cmd_slice); + + if (s->slice_temporal_mvp) { + const __u8 *const rpl = collocated_from_l0_flag ? + sh->ref_idx_l0 : sh->ref_idx_l1; + de->dpbno_col = rpl[sh->collocated_ref_idx]; + } + + /* Write reference picture descriptions */ + weighted_pred_flag = + sh->slice_type == HEVC_SLICE_P ?
+ !!(s->pps.flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_PRED) : + !!(s->pps.flags & V4L2_HEVC_PPS_FLAG_WEIGHTED_BIPRED); + + for (idx = 0; idx < s->nb_refs[L0]; ++idx) { + unsigned int dpb_no = sh->ref_idx_l0[idx]; + + msg_slice(de, + dpb_no | + ((dec->dpb[dpb_no].flags & + V4L2_HEVC_DPB_ENTRY_LONG_TERM_REFERENCE) ? + (1 << 4) : 0) | + (weighted_pred_flag ? (3 << 5) : 0)); + msg_slice(de, dec->dpb[dpb_no].pic_order_cnt_val & 0xffff); + + if (weighted_pred_flag) { + const struct v4l2_hevc_pred_weight_table + *const w = &sh->pred_weight_table; + const int luma_weight_denom = + (1 << w->luma_log2_weight_denom); + const unsigned int chroma_log2_weight_denom = + (w->luma_log2_weight_denom + + w->delta_chroma_log2_weight_denom); + const int chroma_weight_denom = + (1 << chroma_log2_weight_denom); + + msg_slice(de, + w->luma_log2_weight_denom | + (((w->delta_luma_weight_l0[idx] + + luma_weight_denom) & 0x1ff) + << 3)); + msg_slice(de, w->luma_offset_l0[idx] & 0xff); + msg_slice(de, + chroma_log2_weight_denom | + (((w->delta_chroma_weight_l0[idx][0] + + chroma_weight_denom) & 0x1ff) + << 3)); + msg_slice(de, + w->chroma_offset_l0[idx][0] & 0xff); + msg_slice(de, + chroma_log2_weight_denom | + (((w->delta_chroma_weight_l0[idx][1] + + chroma_weight_denom) & 0x1ff) + << 3)); + msg_slice(de, + w->chroma_offset_l0[idx][1] & 0xff); + } + } + + for (idx = 0; idx < s->nb_refs[L1]; ++idx) { + unsigned int dpb_no = sh->ref_idx_l1[idx]; + + msg_slice(de, + dpb_no | + ((dec->dpb[dpb_no].flags & + V4L2_HEVC_DPB_ENTRY_LONG_TERM_REFERENCE) ? + (1 << 4) : 0) | + (weighted_pred_flag ?
(3 << 5) : 0)); + msg_slice(de, dec->dpb[dpb_no].pic_order_cnt_val & 0xffff); + if (weighted_pred_flag) { + const struct v4l2_hevc_pred_weight_table + *const w = &sh->pred_weight_table; + const int luma_weight_denom = + (1 << w->luma_log2_weight_denom); + const unsigned int chroma_log2_weight_denom = + (w->luma_log2_weight_denom + + w->delta_chroma_log2_weight_denom); + const int chroma_weight_denom = + (1 << chroma_log2_weight_denom); + + msg_slice(de, + w->luma_log2_weight_denom | + (((w->delta_luma_weight_l1[idx] + + luma_weight_denom) & 0x1ff) << 3)); + msg_slice(de, w->luma_offset_l1[idx] & 0xff); + msg_slice(de, + chroma_log2_weight_denom | + (((w->delta_chroma_weight_l1[idx][0] + + chroma_weight_denom) & 0x1ff) + << 3)); + msg_slice(de, + w->chroma_offset_l1[idx][0] & 0xff); + msg_slice(de, + chroma_log2_weight_denom | + (((w->delta_chroma_weight_l1[idx][1] + + chroma_weight_denom) & 0x1ff) + << 3)); + msg_slice(de, + w->chroma_offset_l1[idx][1] & 0xff); + } + } + } else { + msg_slice(de, cmd_slice); + } + + msg_slice(de, + (sh->slice_beta_offset_div2 & 15) | + ((sh->slice_tc_offset_div2 & 15) << 4) | + ((sh->flags & + V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_DEBLOCKING_FILTER_DISABLED) ? + 1 << 8 : 0) | + ((sh->flags & + V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_LOOP_FILTER_ACROSS_SLICES_ENABLED) ? + 1 << 9 : 0) | + ((s->pps.flags & + V4L2_HEVC_PPS_FLAG_LOOP_FILTER_ACROSS_TILES_ENABLED) ?
+ 1 << 10 : 0)); + + msg_slice(de, ((sh->slice_cr_qp_offset & 31) << 5) + + (sh->slice_cb_qp_offset & 31)); /* CMD_QPOFF */ +} + +#define CMDS_WRITE_SLICE 1 +static void write_slice(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s, + const u32 slice_const, + const unsigned int ctb_col, + const unsigned int ctb_row) +{ + const unsigned int cs = (1 << s->log2_ctb_size); + const unsigned int w_last = s->sps.pic_width_in_luma_samples & (cs - 1); + const unsigned int h_last = s->sps.pic_height_in_luma_samples & (cs - 1); + + p1_apb_write(de, RPI_SLICE, + slice_const | + ((ctb_col + 1 < s->ctb_width || !w_last ? + cs : w_last) << 17) | + ((ctb_row + 1 < s->ctb_height || !h_last ? + cs : h_last) << 24)); +} + +#define PAUSE_MODE_WPP 1 +#define PAUSE_MODE_TILE 0xffff + +/* + * N.B. This can be called to fill in data from the previous slice so must not + * use any state data that may change from slice to slice (e.g. qp) + */ +#define CMDS_NEW_ENTRY_POINT (6 + CMDS_WRITE_SLICE) + +static void new_entry_point(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s, + const bool do_bte, + const bool reset_qp_y, + const u32 pause_mode, + const unsigned int tile_x, + const unsigned int tile_y, + const unsigned int ctb_col, + const unsigned int ctb_row, + const unsigned int slice_qp, + const u32 slice_const) +{ + const unsigned int endx = s->col_bd[tile_x + 1] - 1; + const unsigned int endy = (pause_mode == PAUSE_MODE_WPP) ?
+ ctb_row : s->row_bd[tile_y + 1] - 1; + + p1_apb_write(de, RPI_TILESTART, + s->col_bd[tile_x] | (s->row_bd[tile_y] << 16)); + p1_apb_write(de, RPI_TILEEND, endx | (endy << 16)); + + if (do_bte) + p1_apb_write(de, RPI_BEGINTILEEND, endx | (endy << 16)); + + write_slice(de, s, slice_const, endx, endy); + + if (reset_qp_y) { + unsigned int sps_qp_bd_offset = + 6 * s->sps.bit_depth_luma_minus8; + + p1_apb_write(de, RPI_QP, sps_qp_bd_offset + slice_qp); + } + + p1_apb_write(de, RPI_MODE, + pause_mode | + ((endx == s->ctb_width - 1) << 17) | + ((endy == s->ctb_height - 1) << 18)); + + p1_apb_write(de, RPI_CONTROL, (ctb_col << 0) | (ctb_row << 16)); + + de->entry_tile_x = tile_x; + de->entry_tile_y = tile_y; + de->entry_ctb_x = ctb_col; + de->entry_ctb_y = ctb_row; + de->entry_qp = slice_qp; + de->entry_slice = slice_const; +} + +/* Wavefront mode */ + +#define CMDS_WPP_PAUSE 4 +static void wpp_pause(struct hevc_d_dec_env *const de, int ctb_row) +{ + p1_apb_write(de, RPI_STATUS, (ctb_row << 18) | 0x25); + p1_apb_write(de, RPI_TRANSFER, PROB_BACKUP); + p1_apb_write(de, RPI_MODE, + ctb_row == de->pic_height_in_ctbs_y - 1 ?
+ 0x70000 : 0x30000); + p1_apb_write(de, RPI_CONTROL, (ctb_row << 16) + 2); +} + +#define CMDS_WPP_ENTRY_FILL_1 (CMDS_WPP_PAUSE + 2 + CMDS_NEW_ENTRY_POINT) +static int wpp_entry_fill(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s, + const unsigned int last_y) +{ + int rv; + const unsigned int last_x = s->ctb_width - 1; + + rv = cmds_check_space(de, CMDS_WPP_ENTRY_FILL_1 * + (last_y - de->entry_ctb_y)); + if (rv) + return rv; + + while (de->entry_ctb_y < last_y) { + /* wpp_entry_x/y set by wpp_entry_point */ + if (s->ctb_width > 2) + wpp_pause(de, de->entry_ctb_y); + p1_apb_write(de, RPI_STATUS, + (de->entry_ctb_y << 18) | (last_x << 5) | 2); + + /* if width == 1 then the saved state is the init one */ + if (s->ctb_width == 2) + p1_apb_write(de, RPI_TRANSFER, PROB_BACKUP); + else + p1_apb_write(de, RPI_TRANSFER, PROB_RELOAD); + + new_entry_point(de, s, false, true, PAUSE_MODE_WPP, + 0, 0, 0, de->entry_ctb_y + 1, + de->entry_qp, de->entry_slice); + } + return 0; +} + +static int wpp_end_previous_slice(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + int rv; + + rv = wpp_entry_fill(de, s, s->prev_ctb_y); + if (rv) + return rv; + + rv = cmds_check_space(de, CMDS_WPP_PAUSE + 2); + if (rv) + return rv; + + if (de->entry_ctb_x < 2 && + (de->entry_ctb_y < s->start_ctb_y || s->start_ctb_x > 2) && + s->ctb_width > 2) + wpp_pause(de, s->prev_ctb_y); + p1_apb_write(de, RPI_STATUS, + 1 | (s->prev_ctb_x << 5) | (s->prev_ctb_y << 18)); + if (s->start_ctb_x == 2 || + (s->ctb_width == 2 && de->entry_ctb_y < s->start_ctb_y)) + p1_apb_write(de, RPI_TRANSFER, PROB_BACKUP); + return 0; +} + +/* + * Only main profile supported so WPP => !Tiles which makes some of the + * next chunk code simpler + */ +static int wpp_decode_slice(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s, + bool last_slice) +{ + bool reset_qp_y = true; + const bool indep =
!s->dependent_slice_segment_flag; + int rv; + + if (s->start_ts) { + rv = wpp_end_previous_slice(de, s); + if (rv) + return rv; + } + pre_slice_decode(de, s); + + rv = cmds_check_space(de, + CMDS_WRITE_BITSTREAM + + CMDS_WRITE_PROB + + CMDS_PROGRAM_SLICECMDS + + CMDS_NEW_SLICE_SEGMENT + + CMDS_NEW_ENTRY_POINT); + if (rv) + return rv; + + rv = write_bitstream(de, s); + if (rv) + return rv; + + if (!s->start_ts || indep || s->ctb_width == 1) + write_prob(de, s); + else if (!s->start_ctb_x) + p1_apb_write(de, RPI_TRANSFER, PROB_RELOAD); + else + reset_qp_y = false; + + program_slicecmds(de, s->slice_idx); + new_slice_segment(de, s); + new_entry_point(de, s, indep, reset_qp_y, PAUSE_MODE_WPP, + 0, 0, s->start_ctb_x, s->start_ctb_y, + s->slice_qp, slice_reg_const(s)); + + if (last_slice) { + rv = wpp_entry_fill(de, s, s->ctb_height - 1); + if (rv) + return rv; + + rv = cmds_check_space(de, CMDS_WPP_PAUSE + 1); + if (rv) + return rv; + + if (de->entry_ctb_x < 2 && s->ctb_width > 2) + wpp_pause(de, s->ctb_height - 1); + + p1_apb_write(de, RPI_STATUS, + 1 | ((s->ctb_width - 1) << 5) | + ((s->ctb_height - 1) << 18)); + } + return 0; +} + +/* Tiles mode */ + +/* Guarantees 1 cmd entry free on exit */ +static int tile_entry_fill(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s, + const unsigned int last_tile_x, + const unsigned int last_tile_y) +{ + while (de->entry_tile_y < last_tile_y || + (de->entry_tile_y == last_tile_y && + de->entry_tile_x < last_tile_x)) { + int rv; + unsigned int t_x = de->entry_tile_x; + unsigned int t_y = de->entry_tile_y; + const unsigned int last_x = s->col_bd[t_x + 1] - 1; + const unsigned int last_y = s->row_bd[t_y + 1] - 1; + + /* One more than needed here */ + rv = cmds_check_space(de, CMDS_NEW_ENTRY_POINT + 3); + if (rv) + return rv; + + p1_apb_write(de, RPI_STATUS, + 2 | (last_x << 5) | (last_y << 18)); + p1_apb_write(de, RPI_TRANSFER, PROB_RELOAD); + + /* Inc tile */ + if (++t_x >=
s->tile_width) { + t_x = 0; + ++t_y; + } + + new_entry_point(de, s, false, true, PAUSE_MODE_TILE, + t_x, t_y, s->col_bd[t_x], s->row_bd[t_y], + de->entry_qp, de->entry_slice); + } + return 0; +} + +/* Write STATUS register with expected end CTU address of previous slice */ +static int end_previous_slice(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + int rv; + + rv = tile_entry_fill(de, s, + ctb_to_tile_x(s, s->prev_ctb_x), + ctb_to_tile_y(s, s->prev_ctb_y)); + if (rv) + return rv; + + p1_apb_write(de, RPI_STATUS, + 1 | (s->prev_ctb_x << 5) | (s->prev_ctb_y << 18)); + return 0; +} + +static int decode_slice(struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s, + bool last_slice) +{ + bool reset_qp_y; + unsigned int tile_x = ctb_to_tile_x(s, s->start_ctb_x); + unsigned int tile_y = ctb_to_tile_y(s, s->start_ctb_y); + int rv; + + if (s->start_ts) { + rv = end_previous_slice(de, s); + if (rv) + return rv; + } + + rv = cmds_check_space(de, + CMDS_WRITE_BITSTREAM + + CMDS_WRITE_PROB + + CMDS_PROGRAM_SLICECMDS + + CMDS_NEW_SLICE_SEGMENT + + CMDS_NEW_ENTRY_POINT); + if (rv) + return rv; + + pre_slice_decode(de, s); + rv = write_bitstream(de, s); + if (rv) + return rv; + + reset_qp_y = !s->start_ts || + !s->dependent_slice_segment_flag || + tile_x != ctb_to_tile_x(s, s->prev_ctb_x) || + tile_y != ctb_to_tile_y(s, s->prev_ctb_y); + if (reset_qp_y) + write_prob(de, s); + + program_slicecmds(de, s->slice_idx); + new_slice_segment(de, s); + new_entry_point(de, s, !s->dependent_slice_segment_flag, reset_qp_y, + PAUSE_MODE_TILE, + tile_x, tile_y, s->start_ctb_x, s->start_ctb_y, + s->slice_qp, slice_reg_const(s)); + + /* + * If this is the last slice then fill in the other tile entries + * now, otherwise this will be done at the start of the next slice + * when it will be known where this slice finishes + */ + if (last_slice) { + rv = tile_entry_fill(de, s, + s->tile_width - 1, + s->tile_height - 1); + if
(rv) + return rv; + p1_apb_write(de, RPI_STATUS, + 1 | ((s->ctb_width - 1) << 5) | + ((s->ctb_height - 1) << 18)); + } + return 0; +} + +/* Scaling factors */ + +static void expand_scaling_list(const unsigned int size_id, + u8 *const dst0, + const u8 *const src0, uint8_t dc) +{ + u8 *d; + unsigned int x, y; + + switch (size_id) { + case 0: + memcpy(dst0, src0, 16); + break; + case 1: + memcpy(dst0, src0, 64); + break; + case 2: + d = dst0; + + for (y = 0; y != 16; y++) { + const u8 *s = src0 + (y >> 1) * 8; + + for (x = 0; x != 8; ++x) { + *d++ = *s; + *d++ = *s++; + } + } + dst0[0] = dc; + break; + default: + d = dst0; + + for (y = 0; y != 32; y++) { + const u8 *s = src0 + (y >> 2) * 8; + + for (x = 0; x != 8; ++x) { + *d++ = *s; + *d++ = *s; + *d++ = *s; + *d++ = *s++; + } + } + dst0[0] = dc; + break; + } +} + +static void populate_scaling_factors(const struct hevc_d_run *const run, + struct hevc_d_dec_env *const de, + const struct hevc_d_dec_state *const s) +{ + const struct v4l2_ctrl_hevc_scaling_matrix *const sl = + run->h265.scaling_matrix; + /* Array of constants for scaling factors */ + static const u32 scaling_factor_offsets[4][6] = { + /* + * MID0 MID1 MID2 MID3 MID4 MID5 + */ + /* SID0 (4x4) */ + { 0x0000, 0x0010, 0x0020, 0x0030, 0x0040, 0x0050 }, + /* SID1 (8x8) */ + { 0x0060, 0x00A0, 0x00E0, 0x0120, 0x0160, 0x01A0 }, + /* SID2 (16x16) */ + { 0x01E0, 0x02E0, 0x03E0, 0x04E0, 0x05E0, 0x06E0 }, + /* SID3 (32x32) */ + { 0x07E0, 0x0BE0, 0x0000, 0x0000, 0x0000, 0x0000 } + }; + unsigned int mid; + + for (mid = 0; mid < 6; mid++) + expand_scaling_list(0, de->scaling_factors + + scaling_factor_offsets[0][mid], + sl->scaling_list_4x4[mid], 0); + for (mid = 0; mid < 6; mid++) + expand_scaling_list(1, de->scaling_factors + + scaling_factor_offsets[1][mid], + sl->scaling_list_8x8[mid], 0); + for (mid = 0; mid < 6; mid++) + expand_scaling_list(2, de->scaling_factors + + scaling_factor_offsets[2][mid], +
sl->scaling_list_16x16[mid], + sl->scaling_list_dc_coef_16x16[mid]); + for (mid = 0; mid < 2; mid++) + expand_scaling_list(3, de->scaling_factors + + scaling_factor_offsets[3][mid], + sl->scaling_list_32x32[mid], + sl->scaling_list_dc_coef_32x32[mid]); +} + +static void free_ps_info(struct hevc_d_dec_state *const s) +{ + kfree(s->ctb_addr_rs_to_ts); + s->ctb_addr_rs_to_ts = NULL; + kfree(s->ctb_addr_ts_to_rs); + s->ctb_addr_ts_to_rs = NULL; + + kfree(s->col_bd); + s->col_bd = NULL; + kfree(s->row_bd); + s->row_bd = NULL; +} + +static unsigned int tile_width(const struct hevc_d_dec_state *const s, + const unsigned int t_x) +{ + return s->col_bd[t_x + 1] - s->col_bd[t_x]; +} + +static unsigned int tile_height(const struct hevc_d_dec_state *const s, + const unsigned int t_y) +{ + return s->row_bd[t_y + 1] - s->row_bd[t_y]; +} + +static void fill_rs_to_ts(struct hevc_d_dec_state *const s) +{ + unsigned int ts = 0; + unsigned int t_y; + unsigned int tr_rs = 0; + + for (t_y = 0; t_y != s->tile_height; ++t_y) { + const unsigned int t_h = tile_height(s, t_y); + unsigned int t_x; + unsigned int tc_rs = tr_rs; + + for (t_x = 0; t_x != s->tile_width; ++t_x) { + const unsigned int t_w = tile_width(s, t_x); + unsigned int y; + unsigned int rs = tc_rs; + + for (y = 0; y != t_h; ++y) { + unsigned int x; + + for (x = 0; x != t_w; ++x) { + s->ctb_addr_rs_to_ts[rs + x] = ts; + s->ctb_addr_ts_to_rs[ts] = rs + x; + ++ts; + } + rs += s->ctb_width; + } + tc_rs += t_w; + } + tr_rs += t_h * s->ctb_width; + } +} + +static int updated_ps(struct hevc_d_dec_state *const s) +{ + unsigned int i; + + free_ps_info(s); + + /* Inferred parameters */ + s->log2_ctb_size = s->sps.log2_min_luma_coding_block_size_minus3 + 3 + + s->sps.log2_diff_max_min_luma_coding_block_size; + + s->ctb_width = (s->sps.pic_width_in_luma_samples + + (1 << s->log2_ctb_size) - 1) >> + s->log2_ctb_size; + s->ctb_height = (s->sps.pic_height_in_luma_samples + + (1 <<
s->log2_ctb_size) - 1) >> + s->log2_ctb_size; + s->ctb_size = s->ctb_width * s->ctb_height; + + s->ctb_addr_rs_to_ts = kmalloc_array(s->ctb_size, + sizeof(*s->ctb_addr_rs_to_ts), + GFP_KERNEL); + if (!s->ctb_addr_rs_to_ts) + goto fail; + s->ctb_addr_ts_to_rs = kmalloc_array(s->ctb_size, + sizeof(*s->ctb_addr_ts_to_rs), + GFP_KERNEL); + if (!s->ctb_addr_ts_to_rs) + goto fail; + + if (!(s->pps.flags & V4L2_HEVC_PPS_FLAG_TILES_ENABLED)) { + s->tile_width = 1; + s->tile_height = 1; + } else { + s->tile_width = s->pps.num_tile_columns_minus1 + 1; + s->tile_height = s->pps.num_tile_rows_minus1 + 1; + } + + s->col_bd = kmalloc_array((s->tile_width + 1), sizeof(*s->col_bd), + GFP_KERNEL); + if (!s->col_bd) + goto fail; + s->row_bd = kmalloc_array((s->tile_height + 1), sizeof(*s->row_bd), + GFP_KERNEL); + if (!s->row_bd) + goto fail; + + s->col_bd[0] = 0; + for (i = 1; i < s->tile_width; i++) + s->col_bd[i] = s->col_bd[i - 1] + + s->pps.column_width_minus1[i - 1] + 1; + s->col_bd[s->tile_width] = s->ctb_width; + + s->row_bd[0] = 0; + for (i = 1; i < s->tile_height; i++) + s->row_bd[i] = s->row_bd[i - 1] + + s->pps.row_height_minus1[i - 1] + 1; + s->row_bd[s->tile_height] = s->ctb_height; + + fill_rs_to_ts(s); + return 0; + +fail: + free_ps_info(s); + /* Set invalid to force reload */ + s->sps.pic_width_in_luma_samples = 0; + return -ENOMEM; +} + +static void setup_colmv(struct hevc_d_ctx *const ctx, struct hevc_d_run *run, + struct hevc_d_dec_state *const s) +{ + ctx->colmv_stride = ALIGN(s->sps.pic_width_in_luma_samples, 64); + ctx->colmv_picsize = ctx->colmv_stride * + (ALIGN(s->sps.pic_height_in_luma_samples, 64) >> 4); +} + +static struct hevc_d_dec_env *dec_env_new(struct hevc_d_ctx *const ctx) +{ + struct hevc_d_dec_env *de; + unsigned long lock_flags; + + spin_lock_irqsave(&ctx->dec_lock, lock_flags); + + de = ctx->dec_free; + if (de) { + ctx->dec_free = de->next; + de->next = NULL; + de->state =
HEVC_D_DECODE_SLICE_START; + } + + spin_unlock_irqrestore(&ctx->dec_lock, lock_flags); + return de; +} + +/* Can be called from irq context */ +static void dec_env_delete(struct hevc_d_dec_env *const de) +{ + struct hevc_d_ctx * const ctx = de->ctx; + unsigned long lock_flags; + + aux_q_release(ctx, &de->frame_aux); + aux_q_release(ctx, &de->col_aux); + + spin_lock_irqsave(&ctx->dec_lock, lock_flags); + + de->state = HEVC_D_DECODE_END; + de->next = ctx->dec_free; + ctx->dec_free = de; + + spin_unlock_irqrestore(&ctx->dec_lock, lock_flags); +} + +static void dec_env_uninit(struct hevc_d_ctx *const ctx) +{ + unsigned int i; + + if (ctx->dec_pool) { + for (i = 0; i != HEVC_D_DEC_ENV_COUNT; ++i) + hwbuf_free(ctx->dev, &ctx->dec_pool[i].cmd); + + kfree(ctx->dec_pool); + } + + ctx->dec_pool = NULL; + ctx->dec_free = NULL; +} + +static int dec_env_init(struct hevc_d_ctx *const ctx) +{ + unsigned int i; + + ctx->dec_pool = kmalloc_objs(*ctx->dec_pool, HEVC_D_DEC_ENV_COUNT, + GFP_KERNEL); + if (!ctx->dec_pool) + return -1; + + spin_lock_init(&ctx->dec_lock); + + ctx->dec_free = ctx->dec_pool; + for (i = 0; i != HEVC_D_DEC_ENV_COUNT - 1; ++i) + ctx->dec_pool[i].next = ctx->dec_pool + i + 1; + + for (i = 0; i != HEVC_D_DEC_ENV_COUNT; ++i) { + struct hevc_d_dec_env *const de = ctx->dec_pool + i; + + de->ctx = ctx; + de->decode_order = i; + de->cmd_max = CMD_BUFFER_SIZE_INIT; + if (hwbuf_alloc(ctx->dev, &de->cmd, + de->cmd_max * sizeof(u64), 0)) + goto fail; + } + + return 0; + +fail: + dec_env_uninit(ctx); + return -1; +} + +/* + * Assume that we get exactly the same DPB for every slice; it makes no real + * sense otherwise.
+ */ +#if V4L2_HEVC_DPB_ENTRIES_NUM_MAX > 16 +#error HEVC_DPB_ENTRIES > h/w slots +#endif + +static u32 mk_config2(const struct hevc_d_dec_state *const s) +{ + const struct v4l2_ctrl_hevc_sps *const sps = &s->sps; + const struct v4l2_ctrl_hevc_pps *const pps = &s->pps; + u32 c; + + c = (sps->bit_depth_luma_minus8 + 8) << 0; /* BitDepthY */ + c |= (sps->bit_depth_chroma_minus8 + 8) << 4; /* BitDepthC */ + if (sps->bit_depth_luma_minus8) /* BitDepthY */ + c |= BIT(8); + if (sps->bit_depth_chroma_minus8) /* BitDepthC */ + c |= BIT(9); + c |= s->log2_ctb_size << 10; + if (pps->flags & V4L2_HEVC_PPS_FLAG_CONSTRAINED_INTRA_PRED) + c |= BIT(13); + if (sps->flags & V4L2_HEVC_SPS_FLAG_STRONG_INTRA_SMOOTHING_ENABLED) + c |= BIT(14); + if (s->mk_aux) + c |= BIT(15); /* Write motion vectors to external memory */ + c |= (pps->log2_parallel_merge_level_minus2 + 2) << 16; + if (s->slice_temporal_mvp) + c |= BIT(19); + if (sps->flags & V4L2_HEVC_SPS_FLAG_PCM_LOOP_FILTER_DISABLED) + c |= BIT(20); + c |= (pps->pps_cb_qp_offset & 31) << 21; + c |= (pps->pps_cr_qp_offset & 31) << 26; + return c; +} + +static inline bool is_ref_unit_type(const unsigned int nal_unit_type) +{ + /* From Table 7-1 + * True for 1, 3, 5, 7, 9, 11, 13, 15 + */ + return (nal_unit_type & ~0xe) != 0; +} + +static int hevc_d_h265_setup(struct hevc_d_ctx *ctx, struct hevc_d_run *run) +{ + struct hevc_d_dev *const dev = ctx->dev; + const struct v4l2_ctrl_hevc_decode_params *const dec = + run->h265.dec; + /* sh0 used where slice header contents should be constant over all + * slices, or first slice of frame + */ + const struct v4l2_ctrl_hevc_slice_params *const sh0 = + run->h265.slice_params; + struct hevc_d_q_aux *dpb_q_aux[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]; + struct hevc_d_dec_state *const s = ctx->state; + struct vb2_queue *vq; + struct hevc_d_dec_env *de = ctx->dec0; + unsigned int prev_rs; + unsigned int i; + int rv; + bool slice_temporal_mvp; + unsigned int ctb_size_y;
+ bool sps_changed = false; + + de = dec_env_new(ctx); + if (!de) { + v4l2_err(&dev->v4l2_dev, "Failed to find free decode env\n"); + return -1; + } + ctx->dec0 = de; + + s->sh = NULL; /* Avoid use until in the slice loop */ + + slice_temporal_mvp = (sh0->flags & + V4L2_HEVC_SLICE_PARAMS_FLAG_SLICE_TEMPORAL_MVP_ENABLED); + + /* Frame start */ + + if (!is_sps_set(run->h265.sps)) { + v4l2_warn(&dev->v4l2_dev, "SPS never set\n"); + goto fail; + } + /* Can't check for PPS easily as all 0's looks valid */ + + if (memcmp(&s->sps, run->h265.sps, sizeof(s->sps)) != 0) { + /* SPS changed */ + memcpy(&s->sps, run->h265.sps, sizeof(s->sps)); + sps_changed = true; + } + if (sps_changed || + memcmp(&s->pps, run->h265.pps, sizeof(s->pps)) != 0) { + /* SPS or PPS changed */ + memcpy(&s->pps, run->h265.pps, sizeof(s->pps)); + + /* Recalc stuff as required */ + rv = updated_ps(s); + if (rv) + goto fail; + } + + ctb_size_y = + 1U << (s->sps.log2_min_luma_coding_block_size_minus3 + + 3 + s->sps.log2_diff_max_min_luma_coding_block_size); + + de->pic_width_in_ctbs_y = + (s->sps.pic_width_in_luma_samples + ctb_size_y - 1) / + ctb_size_y; /* 7-15 */ + de->pic_height_in_ctbs_y = + (s->sps.pic_height_in_luma_samples + ctb_size_y - 1) / + ctb_size_y; /* 7-17 */ + de->cmd_len = 0; + de->dpbno_col = ~0U; + + de->luma_stride = ctx->dst_fmt.height * 128; + de->frame_luma_addr = + vb2_dma_contig_plane_dma_addr(&run->dst->vb2_buf, 0); + de->chroma_stride = de->luma_stride / 2; + de->frame_chroma_addr = + vb2_dma_contig_plane_dma_addr(&run->dst->vb2_buf, 1); + de->frame_aux = NULL; + + if (s->sps.bit_depth_luma_minus8 == 0) { + if (ctx->dst_fmt.pixelformat != V4L2_PIX_FMT_NV12MT_COL128) { + v4l2_err(&dev->v4l2_dev, + "Pixel format %#x != NV12MT_COL128 for 8-bit output\n", + ctx->dst_fmt.pixelformat); + goto fail; + } + } else { + if (ctx->dst_fmt.pixelformat != + V4L2_PIX_FMT_NV12MT_10_COL128) { + v4l2_err(&dev->v4l2_dev, + "Pixel format %#x !=
NV12MT_10_COL128 for 10-bit output\n", + ctx->dst_fmt.pixelformat); + goto fail; + } + } + + if (s->sps.pic_width_in_luma_samples > ctx->dst_fmt.width || + s->sps.pic_height_in_luma_samples > ctx->dst_fmt.height) { + v4l2_warn(&dev->v4l2_dev, + "SPS size (%dx%d) > capture size (%d,%d)\n", + s->sps.pic_width_in_luma_samples, + s->sps.pic_height_in_luma_samples, + ctx->dst_fmt.width, + ctx->dst_fmt.height); + goto fail; + } + + /* + * Fill in ref planes with our address s.t. if we mess up refs + * somehow then we still have a valid address entry + */ + for (i =3D 0; i !=3D 16; ++i) { + de->ref_addrs[i][0] =3D de->frame_luma_addr; + de->ref_addrs[i][1] =3D de->frame_chroma_addr; + } + + /* + * Stash initial temporal_mvp flag + * This must be the same for all pic slices (7.4.7.1) + */ + s->slice_temporal_mvp =3D slice_temporal_mvp; + + /* + * Need Aux ents for all (ref) DPB ents if temporal MV could + * be enabled for any pic + */ + s->use_aux =3D ((s->sps.flags & + V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) !=3D 0); + s->mk_aux =3D s->use_aux && + (s->sps.sps_max_sub_layers_minus1 >=3D sh0->nuh_temporal_id_plus1 || + is_ref_unit_type(sh0->nal_unit_type)); + + /* Phase 2 reg pre-calc */ + de->rpi_config2 =3D mk_config2(s); + de->rpi_framesize =3D (s->sps.pic_height_in_luma_samples << 16) | + s->sps.pic_width_in_luma_samples; + de->rpi_currpoc =3D sh0->slice_pic_order_cnt; + + if (s->sps.flags & + V4L2_HEVC_SPS_FLAG_SPS_TEMPORAL_MVP_ENABLED) { + setup_colmv(ctx, run, s); + } + + s->slice_idx =3D 0; + + if (sh0->slice_segment_addr !=3D 0) { + v4l2_warn(&dev->v4l2_dev, + "New frame but segment_addr=3D%d\n", + sh0->slice_segment_addr); + goto fail; + } + + /* Either map src buffer or use directly */ + s->src_addr =3D 0; + + s->src_addr =3D vb2_dma_contig_plane_dma_addr(&run->src->vb2_buf, 0); + if (!s->src_addr) { + v4l2_err(&dev->v4l2_dev, "Failed to map src buffer\n"); + goto fail; + } + + /* Pre calc parameters */ + s->dec =3D dec; + s->idx_inuse =3D 0; + for (i =3D 0; 
i !=3D run->h265.slice_ents; ++i) { + const struct v4l2_ctrl_hevc_slice_params *const sh =3D sh0 + i; + const bool last_slice =3D i + 1 =3D=3D run->h265.slice_ents; + const u32 byte_size =3D DIV_ROUND_UP(sh->bit_size, 8); + unsigned int j; + + s->sh =3D sh; + + if (sh->data_byte_offset + byte_size > run->src->planes[0].bytesused) { + v4l2_warn(&dev->v4l2_dev, + "data_byte_offset %d + bits %d (=3D %d bytes) > bytesused %d\n", + sh->data_byte_offset, sh->bit_size, byte_size, + run->src->planes[0].bytesused); + goto fail; + } + /* BFNUM (data_len) includes the byte with rbsp_stop_one_bit which is not + * part of slice_segment_data but is all but certain to be in the input + * stream so add that when calculating the value we need, but limit to the + * actual size of the buffer (which may well be what is used to set + * bit_size if the caller isn't being very pedantic). + */ + s->data_len =3D min(sh->bit_size / 8 + 1, + run->src->planes[0].bytesused - sh->data_byte_offset); + + s->slice_qp =3D 26 + s->pps.init_qp_minus26 + sh->slice_qp_delta; + s->max_num_merge_cand =3D sh->slice_type =3D=3D HEVC_SLICE_I ? + 0 : + (5 - sh->five_minus_max_num_merge_cand); + s->dependent_slice_segment_flag =3D + ((sh->flags & + V4L2_HEVC_SLICE_PARAMS_FLAG_DEPENDENT_SLICE_SEGMENT) !=3D 0); + + s->nb_refs[0] =3D (sh->slice_type =3D=3D HEVC_SLICE_I) ? + 0 : + sh->num_ref_idx_l0_active_minus1 + 1; + s->nb_refs[1] =3D (sh->slice_type !=3D HEVC_SLICE_B) ? 
+ 0 : + sh->num_ref_idx_l1_active_minus1 + 1; + + for (j =3D 0; j !=3D s->nb_refs[0]; ++j) + s->idx_inuse |=3D 1 << sh->ref_idx_l0[j]; + for (j =3D 0; j !=3D s->nb_refs[1]; ++j) + s->idx_inuse |=3D 1 << sh->ref_idx_l1[j]; + + if (s->sps.flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED) + populate_scaling_factors(run, de, s); + + /* Calc all the random coord info to avoid repeated conversion in/out */ + s->start_ts =3D s->ctb_addr_rs_to_ts[sh->slice_segment_addr]; + s->start_ctb_x =3D sh->slice_segment_addr % de->pic_width_in_ctbs_y; + s->start_ctb_y =3D sh->slice_segment_addr / de->pic_width_in_ctbs_y; + /* Last CTB of previous slice */ + prev_rs =3D !s->start_ts ? 0 : s->ctb_addr_ts_to_rs[s->start_ts - 1]; + s->prev_ctb_x =3D prev_rs % de->pic_width_in_ctbs_y; + s->prev_ctb_y =3D prev_rs / de->pic_width_in_ctbs_y; + + if ((s->pps.flags & V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED)) + rv =3D wpp_decode_slice(de, s, last_slice); + else + rv =3D decode_slice(de, s, last_slice); + if (rv) + goto fail; + + ++s->slice_idx; + } + + /* Frame end */ + memset(dpb_q_aux, 0, + sizeof(*dpb_q_aux) * V4L2_HEVC_DPB_ENTRIES_NUM_MAX); + + /* + * Locate ref frames + * At least in the current implementation this is constant across all + * slices. If this changes we will need idx mapping code. 
+ */ + + vq =3D v4l2_m2m_get_vq(ctx->fh.m2m_ctx, + V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE); + + if (!vq) { + v4l2_err(&dev->v4l2_dev, "VQ gone!\n"); + goto fail; + } + + for (i =3D 0; i < dec->num_active_dpb_entries; ++i) { + struct vb2_buffer *buf =3D vb2_find_buffer(vq, dec->dpb[i].timestamp); + + if (!buf) { + if (!(s->idx_inuse & (1 << i))) + hevc_d_dbg(2, &dev->v4l2_dev, + "Missing unused DPB ent %d, timestamp=3D%lld\n", + i, (long long)dec->dpb[i].timestamp); + else + v4l2_warn(&dev->v4l2_dev, + "Missing inuse DPB ent %d, timestamp=3D%lld\n", + i, (long long)dec->dpb[i].timestamp); + continue; + } + + if (s->use_aux) { + int buffer_index =3D buf->index; + + dpb_q_aux[i] =3D aux_q_ref_idx(ctx, buffer_index); + if (!dpb_q_aux[i]) + v4l2_warn(&dev->v4l2_dev, + "Missing DPB AUX ent %d, timestamp=3D%lld, index=3D%d\n", + i, (long long)dec->dpb[i].timestamp, + buffer_index); + } + + de->ref_addrs[i][0] =3D + vb2_dma_contig_plane_dma_addr(buf, 0); + de->ref_addrs[i][1] =3D + vb2_dma_contig_plane_dma_addr(buf, 1); + } + + /* Move DPB from temp */ + for (i =3D 0; i !=3D V4L2_HEVC_DPB_ENTRIES_NUM_MAX; ++i) { + aux_q_release(ctx, &s->ref_aux[i]); + s->ref_aux[i] =3D dpb_q_aux[i]; + } + + /* Unref the old frame aux too - it is either in the DPB or not now */ + aux_q_release(ctx, &s->frame_aux); + + if (s->mk_aux) { + s->frame_aux =3D aux_q_new(ctx, run->dst->vb2_buf.index); + + if (!s->frame_aux) { + v4l2_err(&dev->v4l2_dev, + "Failed to obtain aux storage for frame\n"); + goto fail; + } + + de->frame_aux =3D aux_q_ref(ctx, s->frame_aux); + } + + if (de->dpbno_col !=3D ~0U) { + if (de->dpbno_col >=3D dec->num_active_dpb_entries) { + v4l2_err(&dev->v4l2_dev, + "Col ref index %d >=3D %d\n", + de->dpbno_col, + dec->num_active_dpb_entries); + } else { + /* Standard requires that the col pic is constant for + * the duration of the pic (text of collocated_ref_idx + * in H265-2 2018 7.4.7.1) + */ + + /* Spot the collocated ref in passing */ + de->col_aux =3D aux_q_ref(ctx, + 
dpb_q_aux[de->dpbno_col]); + + if (!de->col_aux) { + v4l2_warn(&dev->v4l2_dev, + "Missing DPB ent for col\n"); + /* Need to abort if this fails as P2 may + * explode on bad data + */ + goto fail; + } + } + } + + de->state =3D HEVC_D_DECODE_PHASE1; + return 0; + +fail: + /* Actual error reporting happens in Trigger */ + de->state =3D HEVC_D_DECODE_ERROR_DONE; + return 0; +} + +/* Handle PU and COEFF stream overflow + * + * Returns: + * -1 Phase 1 decode error + * 0 OK + * >0 Out of space (bitmask) + */ + +#define STATUS_COEFF_EXHAUSTED 8 +#define STATUS_PU_EXHAUSTED 16 + +static int check_status(const struct hevc_d_dev *const dev) +{ + const u32 cfstatus =3D apb_read(dev, RPI_CFSTATUS); + const u32 cfnum =3D apb_read(dev, RPI_CFNUM); + u32 status =3D apb_read(dev, RPI_STATUS); + + /* + * Handle PU and COEFF stream overflow + * This is the definition of successful completion of phase 1. + * It assures that status register is zero and all blocks in each tile + * have completed + */ + if (cfstatus =3D=3D cfnum) + return 0; + + status &=3D (STATUS_PU_EXHAUSTED | STATUS_COEFF_EXHAUSTED); + if (status) + return status; + + return -1; +} + +static void phase2_done(struct hevc_d_dev *const dev, + struct hevc_d_dec_env *const de, + enum vb2_buffer_state state) +{ + v4l2_m2m_buf_done(de->frame_buf, state); + de->frame_buf =3D NULL; + + media_request_manual_complete(de->request); + de->request =3D NULL; + + dec_env_delete(de); + + /* Finally allow new P1. 
Avoids possibility of race with de alloc */ + hevc_d_hw_irq_active1_enable_claim(dev, 1); +} + +static void phase2_cb(struct hevc_d_dev *const dev, void *v) +{ + phase2_done(dev, v, VB2_BUF_STATE_DONE); +} + +static void phase2_claimed(struct hevc_d_dev *const dev, void *v) +{ + struct hevc_d_dec_env *const de =3D v; + unsigned int i; + + apb_write_vc_addr(dev, RPI_PURBASE, de->pu_base_vc); + apb_write_vc_len(dev, RPI_PURSTRIDE, de->pu_stride); + apb_write_vc_addr(dev, RPI_COEFFRBASE, de->coeff_base_vc); + apb_write_vc_len(dev, RPI_COEFFRSTRIDE, de->coeff_stride); + + apb_write_vc_addr(dev, RPI_OUTYBASE, de->frame_luma_addr); + apb_write_vc_addr(dev, RPI_OUTCBASE, de->frame_chroma_addr); + apb_write_vc_len(dev, RPI_OUTYSTRIDE, de->luma_stride); + apb_write_vc_len(dev, RPI_OUTCSTRIDE, de->chroma_stride); + + for (i =3D 0; i < 16; i++) { + /* Strides are in fact unused but fill in anyway */ + unsigned int roff =3D i * RPI_REFREGS_SIZE; + + apb_write_vc_addr(dev, RPI_REFYBASE0 + roff, + de->ref_addrs[i][0]); + apb_write_vc_len(dev, RPI_REFYSTRIDE0 + roff, + de->luma_stride); + apb_write_vc_addr(dev, RPI_REFCBASE0 + roff, + de->ref_addrs[i][1]); + apb_write_vc_len(dev, RPI_REFCSTRIDE0 + roff, + de->chroma_stride); + } + + apb_write(dev, RPI_CONFIG2, de->rpi_config2); + apb_write(dev, RPI_FRAMESIZE, de->rpi_framesize); + apb_write(dev, RPI_CURRPOC, de->rpi_currpoc); + + /* collocated reads/writes */ + apb_write_vc_len(dev, RPI_COLSTRIDE, + de->ctx->colmv_stride); + apb_write_vc_len(dev, RPI_MVSTRIDE, + de->ctx->colmv_stride); + apb_write_vc_addr(dev, RPI_MVBASE, + !de->frame_aux ? 0 : de->frame_aux->col.addr); + apb_write_vc_addr(dev, RPI_COLBASE, + !de->col_aux ? 
0 : de->col_aux->col.addr); + + hevc_d_hw_irq_active2_irq(dev, &de->irq_ent, phase2_cb, de); + + apb_write_final(dev, RPI_NUMROWS, de->pic_height_in_ctbs_y); +} + +static void phase2_err_claimed(struct hevc_d_dev *const dev, void *v) +{ + phase2_done(dev, v, VB2_BUF_STATE_ERROR); +} + +static void phase1_claimed(struct hevc_d_dev *const dev, void *v); + +static void phase1_done(struct hevc_d_dev *const dev, + struct hevc_d_dec_env *const de, + enum vb2_buffer_state state) +{ + struct hevc_d_ctx *const ctx =3D de->ctx; + hevc_d_irq_callback p2_cb; + + p2_cb =3D (state =3D=3D VB2_BUF_STATE_DONE) ? phase2_claimed : + phase2_err_claimed; + v4l2_m2m_buf_done(de->src_buf, state); + de->src_buf =3D NULL; + + /* All phase1 error paths done - it is safe to inc p2idx */ + ctx->p2idx =3D (ctx->p2idx + 1 >=3D HEVC_D_P2BUF_COUNT) ? 0 : ctx->p2idx + 1; + + /* Re-enable the next setup if we were blocking */ + if (atomic_add_return(-1, &ctx->p1out) >=3D HEVC_D_P1BUF_COUNT - 1) + v4l2_m2m_job_finish(dev->m2m_dev, ctx->fh.m2m_ctx); + + hevc_d_hw_irq_active2_claim(dev, &de->irq_ent, p2_cb, de); +} + +static void phase1_thread(struct hevc_d_dev *const dev, void *v) +{ + struct hevc_d_dec_env *const de =3D v; + struct hevc_d_ctx *const ctx =3D de->ctx; + + struct hevc_d_hwbuf *const pu_hwbuf =3D ctx->pu_bufs + ctx->p2idx; + struct hevc_d_hwbuf *const coeff_hwbuf =3D ctx->coeff_bufs + ctx->p2idx; + + if (de->p1_status & STATUS_PU_EXHAUSTED) { + if (hwbuf_realloc_new(dev, pu_hwbuf, next_size(pu_hwbuf->size))) { + v4l2_err(&dev->v4l2_dev, + "%s: PU realloc (%zx) failed\n", + __func__, pu_hwbuf->size); + goto fail; + } + hevc_d_dbg(1, &dev->v4l2_dev, "%s: PU realloc (%zx) OK\n", + __func__, pu_hwbuf->size); + } + + if (de->p1_status & STATUS_COEFF_EXHAUSTED) { + if (hwbuf_realloc_new(dev, coeff_hwbuf, + next_size(coeff_hwbuf->size))) { + v4l2_err(&dev->v4l2_dev, + "%s: Coeff realloc (%zx) failed\n", + __func__, coeff_hwbuf->size); + goto fail; + } + hevc_d_dbg(1, &dev->v4l2_dev, "%s: Coeff 
realloc (%zx) OK\n", + __func__, coeff_hwbuf->size); + } + + phase1_claimed(dev, de); + return; + +fail: + if (!pu_hwbuf->addr || !coeff_hwbuf->addr) { + v4l2_err(&dev->v4l2_dev, + "%s: Fatal: failed to reclaim old alloc\n", + __func__); + ctx->fatal_err =3D 1; + } + phase1_done(dev, de, VB2_BUF_STATE_ERROR); +} + +/* Always called in irq context (this is good) */ +static void phase1_cb(struct hevc_d_dev *const dev, void *v) +{ + struct hevc_d_dec_env *const de =3D v; + + de->p1_status =3D check_status(dev); + + if (de->p1_status !=3D 0) { + hevc_d_dbg(2, &dev->v4l2_dev, "%s: Post wait: %#x\n", + __func__, de->p1_status); + + if (de->p1_status < 0) + goto fail; + + /* Need to realloc - push onto a thread rather than IRQ */ + hevc_d_hw_irq_active1_thread(dev, &de->irq_ent, + phase1_thread, de); + return; + } + + phase1_done(dev, de, VB2_BUF_STATE_DONE); + return; + +fail: + phase1_done(dev, de, VB2_BUF_STATE_ERROR); +} + +static void phase1_claimed(struct hevc_d_dev *const dev, void *v) +{ + struct hevc_d_dec_env *const de =3D v; + struct hevc_d_ctx *const ctx =3D de->ctx; + + const struct hevc_d_hwbuf * const pu_hwbuf =3D ctx->pu_bufs + ctx->p2idx; + const struct hevc_d_hwbuf * const coeff_hwbuf =3D ctx->coeff_bufs + + ctx->p2idx; + + if (ctx->fatal_err) + goto fail; + + de->pu_base_vc =3D pu_hwbuf->addr; + de->pu_stride =3D + ALIGN_DOWN(pu_hwbuf->size / de->pic_height_in_ctbs_y, 64); + + de->coeff_base_vc =3D coeff_hwbuf->addr; + de->coeff_stride =3D + ALIGN_DOWN(coeff_hwbuf->size / de->pic_height_in_ctbs_y, 64); + + /* phase1_claimed blocked until cb_phase1 completed so p2idx inc + * in cb_phase1 after error detection + */ + + apb_write_vc_addr(dev, RPI_PUWBASE, de->pu_base_vc); + apb_write_vc_len(dev, RPI_PUWSTRIDE, de->pu_stride); + apb_write_vc_addr(dev, RPI_COEFFWBASE, de->coeff_base_vc); + apb_write_vc_len(dev, RPI_COEFFWSTRIDE, de->coeff_stride); + + /* Trigger command FIFO */ + apb_write(dev, RPI_CFNUM, de->cmd_len); + + /* Claim irq */ + 
hevc_d_hw_irq_active1_irq(dev, &de->irq_ent, phase1_cb, de); + + /* Start the h/w */ + apb_write_vc_addr_final(dev, RPI_CFBASE, de->cmd.addr); + return; + +fail: + phase1_done(dev, de, VB2_BUF_STATE_ERROR); +} + +static void phase1_err_claimed(struct hevc_d_dev *const dev, void *v) +{ + phase1_done(dev, v, VB2_BUF_STATE_ERROR); +} + +static void dec_state_delete(struct hevc_d_ctx *const ctx) +{ + unsigned int i; + struct hevc_d_dec_state *const s =3D ctx->state; + + if (!s) + return; + ctx->state =3D NULL; + + free_ps_info(s); + + for (i =3D 0; i !=3D HEVC_MAX_REFS; ++i) + aux_q_release(ctx, &s->ref_aux[i]); + aux_q_release(ctx, &s->frame_aux); + + kfree(s); +} + +struct irq_sync { + atomic_t done; + wait_queue_head_t wq; + struct hevc_d_hw_irq_ent irq_ent; +}; + +static void phase2_sync_claimed(struct hevc_d_dev *const dev, void *v) +{ + struct irq_sync *const sync =3D v; + + atomic_set(&sync->done, 1); + wake_up(&sync->wq); +} + +static void phase1_sync_claimed(struct hevc_d_dev *const dev, void *v) +{ + struct irq_sync *const sync =3D v; + + hevc_d_hw_irq_active1_enable_claim(dev, 1); + hevc_d_hw_irq_active2_claim(dev, &sync->irq_ent, phase2_sync_claimed, syn= c); +} + +/* Sync with IRQ operations + * + * Claims phase1 and phase2 in turn and waits for the phase2 claim so any + * pending IRQ ops will have completed by the time this returns + * + * phase1 has counted enables so must reenable once claimed + * phase2 has unlimited enables + */ +static void irq_sync(struct hevc_d_dev *const dev) +{ + struct irq_sync sync; + + atomic_set(&sync.done, 0); + init_waitqueue_head(&sync.wq); + + hevc_d_hw_irq_active1_claim(dev, &sync.irq_ent, phase1_sync_claimed, &syn= c); + wait_event(sync.wq, atomic_read(&sync.done)); +} + +static void h265_ctx_uninit(struct hevc_d_dev *const dev, struct hevc_d_ct= x *ctx) +{ + unsigned int i; + + dec_env_uninit(ctx); + dec_state_delete(ctx); + + /* + * dec_env & state must be killed before this to release the buffer to + * the free pool 
+ */ + aux_q_uninit(ctx); + + for (i =3D 0; i !=3D ARRAY_SIZE(ctx->pu_bufs); ++i) + hwbuf_free(dev, ctx->pu_bufs + i); + for (i =3D 0; i !=3D ARRAY_SIZE(ctx->coeff_bufs); ++i) + hwbuf_free(dev, ctx->coeff_bufs + i); +} + +void hevc_d_h265_stop(struct hevc_d_ctx *ctx) +{ + struct hevc_d_dev *const dev =3D ctx->dev; + + irq_sync(dev); + h265_ctx_uninit(dev, ctx); +} + +int hevc_d_h265_start(struct hevc_d_ctx *ctx) +{ + struct hevc_d_dev *const dev =3D ctx->dev; + unsigned int i; + + const unsigned int wxh =3D ctx->dst_fmt.width * ctx->dst_fmt.height; + size_t pu_size; + size_t coeff_size; + + ctx->fatal_err =3D 0; + ctx->dec0 =3D NULL; + ctx->state =3D kzalloc_obj(*ctx->state, GFP_KERNEL); + if (!ctx->state) { + v4l2_err(&dev->v4l2_dev, "Failed to allocate decode state\n"); + goto fail; + } + + if (dec_env_init(ctx) !=3D 0) { + v4l2_err(&dev->v4l2_dev, "Failed to allocate decode envs\n"); + goto fail; + } + + coeff_size =3D hevc_d_round_up_size(wxh); + pu_size =3D hevc_d_round_up_size(wxh / 4); + for (i =3D 0; i !=3D ARRAY_SIZE(ctx->pu_bufs); ++i) { + /* Don't actually need a kernel mapping here */ + if (hwbuf_alloc(dev, ctx->pu_bufs + i, pu_size, + DMA_ATTR_NO_KERNEL_MAPPING)) { + v4l2_err(&dev->v4l2_dev, + "Failed to alloc %#zx PU%d buffer\n", + pu_size, i); + goto fail; + } + if (hwbuf_alloc(dev, ctx->coeff_bufs + i, coeff_size, + DMA_ATTR_NO_KERNEL_MAPPING)) { + v4l2_err(&dev->v4l2_dev, + "Failed to alloc %#zx Coeff%d buffer\n", + coeff_size, i); + goto fail; + } + } + aux_q_init(ctx); + + return 0; + +fail: + h265_ctx_uninit(dev, ctx); + return -ENOMEM; +} + +void hevc_d_h265_trigger(struct hevc_d_ctx *ctx) +{ + struct hevc_d_dev *const dev =3D ctx->dev; + struct hevc_d_dec_env *const de =3D ctx->dec0; + struct vb2_v4l2_buffer *src_buf; + struct media_request *req; + hevc_d_irq_callback p1_cb; + + p1_cb =3D (de->state =3D=3D HEVC_D_DECODE_PHASE1) ? 
phase1_claimed : + phase1_err_claimed; + + src_buf =3D v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); + req =3D src_buf->vb2_buf.req_obj.req; + ctx->dec0 =3D NULL; + + /* We know we have src & dst so no need to test */ + de->src_buf =3D v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + de->frame_buf =3D v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + de->request =3D req; + + /* Enable the next setup if our Q isn't too big */ + if (atomic_add_return(1, &ctx->p1out) < HEVC_D_P1BUF_COUNT) + v4l2_m2m_job_finish(dev->m2m_dev, ctx->fh.m2m_ctx); + + hevc_d_hw_irq_active1_claim(dev, &de->irq_ent, p1_cb, de); +} + +static int try_ctrl_sps(struct v4l2_ctrl *ctrl) +{ + const struct v4l2_ctrl_hevc_sps *const sps =3D ctrl->p_new.p_hevc_sps; + struct hevc_d_ctx *const ctx =3D ctrl->priv; + struct hevc_d_dev *const dev =3D ctx->dev; + + if (sps->chroma_format_idc !=3D 1) { + v4l2_warn(&dev->v4l2_dev, + "Chroma format (%d) unsupported\n", + sps->chroma_format_idc); + return -EINVAL; + } + + if (sps->bit_depth_luma_minus8 !=3D 0 && + sps->bit_depth_luma_minus8 !=3D 2) { + v4l2_warn(&dev->v4l2_dev, + "Luma depth (%d) unsupported\n", + sps->bit_depth_luma_minus8 + 8); + return -EINVAL; + } + + if (sps->bit_depth_luma_minus8 !=3D sps->bit_depth_chroma_minus8) { + v4l2_warn(&dev->v4l2_dev, + "Chroma depth (%d) !=3D Luma depth (%d)\n", + sps->bit_depth_chroma_minus8 + 8, + sps->bit_depth_luma_minus8 + 8); + return -EINVAL; + } + + if (sps->pic_width_in_luma_samples < HEVC_D_MIN_WIDTH || + sps->pic_height_in_luma_samples < HEVC_D_MIN_HEIGHT || + sps->pic_width_in_luma_samples > HEVC_D_MAX_WIDTH || + sps->pic_height_in_luma_samples > HEVC_D_MAX_HEIGHT) { + v4l2_warn(&dev->v4l2_dev, + "Bad sps width (%u) x height (%u)\n", + sps->pic_width_in_luma_samples, + sps->pic_height_in_luma_samples); + return -EINVAL; + } + + return 0; +} + +const struct v4l2_ctrl_ops hevc_d_hevc_sps_ctrl_ops =3D { + .try_ctrl =3D try_ctrl_sps, +}; + +static int try_ctrl_pps(struct v4l2_ctrl *ctrl) +{ + const struct 
v4l2_ctrl_hevc_pps *const pps =3D ctrl->p_new.p_hevc_pps; + struct hevc_d_ctx *const ctx =3D ctrl->priv; + struct hevc_d_dev *const dev =3D ctx->dev; + + if ((pps->flags & + V4L2_HEVC_PPS_FLAG_ENTROPY_CODING_SYNC_ENABLED) && + (pps->flags & + V4L2_HEVC_PPS_FLAG_TILES_ENABLED) && + (pps->num_tile_columns_minus1 || pps->num_tile_rows_minus1)) { + v4l2_warn(&dev->v4l2_dev, + "WPP + Tiles not supported\n"); + return -EINVAL; + } + + return 0; +} + +const struct v4l2_ctrl_ops hevc_d_hevc_pps_ctrl_ops =3D { + .try_ctrl =3D try_ctrl_pps, +}; + +void hevc_d_device_run(void *priv) +{ + struct hevc_d_ctx *const ctx =3D priv; + struct hevc_d_dev *const dev =3D ctx->dev; + struct hevc_d_run run =3D {}; + struct media_request *src_req; + const struct v4l2_ctrl *ctrl; + + run.src =3D v4l2_m2m_next_src_buf(ctx->fh.m2m_ctx); + run.dst =3D v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx); + + /* Apply request(s) controls */ + src_req =3D run.src->vb2_buf.req_obj.req; + if (v4l2_ctrl_request_setup(src_req, &ctx->hdl)) + goto fail; + + run.h265.sps =3D + hevc_d_find_control_data(ctx, + V4L2_CID_STATELESS_HEVC_SPS); + run.h265.pps =3D + hevc_d_find_control_data(ctx, + V4L2_CID_STATELESS_HEVC_PPS); + run.h265.dec =3D + hevc_d_find_control_data(ctx, + V4L2_CID_STATELESS_HEVC_DECODE_PARAMS); + + ctrl =3D v4l2_ctrl_find(ctx->fh.ctrl_handler, + V4L2_CID_STATELESS_HEVC_SLICE_PARAMS); + if (!ctrl || !ctrl->elems) { + v4l2_err(&dev->v4l2_dev, "%s: Missing slice params\n", + __func__); + goto fail; + } + run.h265.slice_ents =3D ctrl->elems; + run.h265.slice_params =3D ctrl->p_cur.p; + + run.h265.scaling_matrix =3D + hevc_d_find_control_data(ctx, + V4L2_CID_STATELESS_HEVC_SCALING_MATRIX); + + v4l2_m2m_buf_copy_metadata(run.src, run.dst); + + if (hevc_d_h265_setup(ctx, &run) =3D=3D -1) + goto fail; + + /* Complete request(s) controls */ + v4l2_ctrl_request_complete(src_req, &ctx->hdl); + + hevc_d_h265_trigger(ctx); + return; + +fail: + /* We really shouldn't get here but tidy up what we can */ + 
v4l2_m2m_buf_done_and_job_finish(dev->m2m_dev, ctx->fh.m2m_ctx, + VB2_BUF_STATE_ERROR); + media_request_manual_complete(src_req); +} diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_h265.h b/dr= ivers/media/platform/raspberrypi/hevc_dec/hevc_d_h265.h new file mode 100644 index 000000000000..3f0e0ecda9fe --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_h265.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + */ + +#ifndef _HEVC_D_H265_H_ +#define _HEVC_D_H265_H_ +#include "hevc_d.h" + +extern const struct v4l2_ctrl_ops hevc_d_hevc_sps_ctrl_ops; +extern const struct v4l2_ctrl_ops hevc_d_hevc_pps_ctrl_ops; + +int hevc_d_h265_start(struct hevc_d_ctx *ctx); +void hevc_d_h265_stop(struct hevc_d_ctx *ctx); +void hevc_d_h265_trigger(struct hevc_d_ctx *ctx); + +void hevc_d_device_run(void *priv); + +#endif diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_hw.c b/driv= ers/media/platform/raspberrypi/hevc_dec/hevc_d_hw.c new file mode 100644 index 000000000000..8e274ff25d8e --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_hw.c @@ -0,0 +1,429 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + * Copyright (C) 2018 Bootlin + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include + +#include "hevc_d.h" +#include "hevc_d_hw.h" + +static void pre_irq(struct hevc_d_dev *dev, struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback cb, void *v, + struct hevc_d_hw_irq_ctrl *ictl) +{ + unsigned long flags; + + if (ictl->irq) { + v4l2_err(&dev->v4l2_dev, "Attempt to claim IRQ when already claimed\n"); + return; + } + + ient->cb =3D 
cb; + ient->v =3D v; + + spin_lock_irqsave(&ictl->lock, flags); + ictl->irq =3D ient; + ictl->no_sched++; + spin_unlock_irqrestore(&ictl->lock, flags); +} + +/* Should be called from inside ictl->lock */ +static inline bool sched_enabled(const struct hevc_d_hw_irq_ctrl * const i= ctl) +{ + return ictl->no_sched <=3D 0 && ictl->enable; +} + +/* Should be called from inside ictl->lock & after checking sched_enabled(= ) */ +static inline void set_claimed(struct hevc_d_hw_irq_ctrl * const ictl) +{ + if (ictl->enable > 0) + --ictl->enable; + ictl->no_sched =3D 1; +} + +/* Should be called from inside ictl->lock */ +static struct hevc_d_hw_irq_ent *get_sched(struct hevc_d_hw_irq_ctrl * con= st ictl) +{ + struct hevc_d_hw_irq_ent *ient; + + if (!sched_enabled(ictl)) + return NULL; + + ient =3D ictl->claim; + if (!ient) + return NULL; + ictl->claim =3D ient->next; + + set_claimed(ictl); + return ient; +} + +/* Run a callback & check to see if there is anything else to run */ +static void sched_cb(struct hevc_d_dev * const dev, + struct hevc_d_hw_irq_ctrl * const ictl, + struct hevc_d_hw_irq_ent *ient) +{ + while (ient) { + unsigned long flags; + + ient->cb(dev, ient->v); + + spin_lock_irqsave(&ictl->lock, flags); + + /* + * Always dec no_sched after cb exec - must have been set + * on entry to cb + */ + --ictl->no_sched; + ient =3D get_sched(ictl); + + spin_unlock_irqrestore(&ictl->lock, flags); + } +} + +/* Should only ever be called from its own IRQ cb so no lock required */ +static void pre_thread(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback cb, void *v, + struct hevc_d_hw_irq_ctrl *ictl) +{ + ient->cb =3D cb; + ient->v =3D v; + ictl->irq =3D ient; + ictl->thread_reqed =3D true; + ictl->no_sched++; /* This is unwound in do_thread */ +} + +/* Called in irq context */ +static void do_irq(struct hevc_d_dev * const dev, + struct hevc_d_hw_irq_ctrl * const ictl) +{ + struct hevc_d_hw_irq_ent *ient; + unsigned long flags; + + 
spin_lock_irqsave(&ictl->lock, flags); + ient =3D ictl->irq; + ictl->irq =3D NULL; + spin_unlock_irqrestore(&ictl->lock, flags); + + sched_cb(dev, ictl, ient); +} + +static void do_claim(struct hevc_d_dev * const dev, + struct hevc_d_hw_irq_ent *ient, + const hevc_d_irq_callback cb, void * const v, + struct hevc_d_hw_irq_ctrl * const ictl) +{ + unsigned long flags; + + ient->next =3D NULL; + ient->cb =3D cb; + ient->v =3D v; + + spin_lock_irqsave(&ictl->lock, flags); + + if (ictl->claim) { + /* If we have a Q then add to end */ + ictl->tail->next =3D ient; + ictl->tail =3D ient; + ient =3D NULL; + } else if (!sched_enabled(ictl)) { + /* Empty Q but other activity in progress so Q */ + ictl->claim =3D ient; + ictl->tail =3D ient; + ient =3D NULL; + } else { + /* + * Nothing else going on - schedule immediately and + * prevent anything else scheduling claims + */ + set_claimed(ictl); + } + + spin_unlock_irqrestore(&ictl->lock, flags); + + sched_cb(dev, ictl, ient); +} + +/* Enable n claims. + * n < 0 set to unlimited (default on init) + * n =3D 0 if previously unlimited then disable otherwise nop + * n > 0 if previously unlimited then set to n enables + * otherwise add n enables + * The enable count is automatically decremented every time a claim is run + */ +static void do_enable_claim(struct hevc_d_dev * const dev, + int n, + struct hevc_d_hw_irq_ctrl * const ictl) +{ + unsigned long flags; + struct hevc_d_hw_irq_ent *ient; + + spin_lock_irqsave(&ictl->lock, flags); + ictl->enable =3D n < 0 ? -1 : ictl->enable <=3D 0 ? 
n : ictl->enable + n; + ient =3D get_sched(ictl); + spin_unlock_irqrestore(&ictl->lock, flags); + + sched_cb(dev, ictl, ient); +} + +static void ictl_init(struct hevc_d_hw_irq_ctrl * const ictl, int enables) +{ + spin_lock_init(&ictl->lock); + ictl->claim =3D NULL; + ictl->tail =3D NULL; + ictl->irq =3D NULL; + ictl->no_sched =3D 0; + ictl->enable =3D enables; + ictl->thread_reqed =3D false; +} + +static void ictl_uninit(struct hevc_d_hw_irq_ctrl * const ictl) +{ + /* Nothing to do */ +} + +static irqreturn_t hevc_d_irq_irq(int irq, void *data) +{ + struct hevc_d_dev * const dev =3D data; + __u32 ictrl; + + ictrl =3D irq_read(dev, ARG_IC_ICTRL); + if (!(ictrl & ARG_IC_ICTRL_ALL_IRQ_MASK)) { + v4l2_warn(&dev->v4l2_dev, "IRQ but no IRQ bits set\n"); + return IRQ_NONE; + } + + /* Cancel any/all irqs */ + irq_write(dev, ARG_IC_ICTRL, ictrl & ~ARG_IC_ICTRL_SET_ZERO_MASK); + + /* + * Service Active2 before Active1 so Phase 1 can transition to Phase 2 + * without delay + */ + if (ictrl & ARG_IC_ICTRL_ACTIVE2_INT_SET) + do_irq(dev, &dev->ic_active2); + if (ictrl & ARG_IC_ICTRL_ACTIVE1_INT_SET) + do_irq(dev, &dev->ic_active1); + + return dev->ic_active1.thread_reqed || dev->ic_active2.thread_reqed ? 
+ IRQ_WAKE_THREAD : IRQ_HANDLED; +} + +static void do_thread(struct hevc_d_dev * const dev, + struct hevc_d_hw_irq_ctrl *const ictl) +{ + unsigned long flags; + struct hevc_d_hw_irq_ent *ient =3D NULL; + + spin_lock_irqsave(&ictl->lock, flags); + + if (ictl->thread_reqed) { + ient =3D ictl->irq; + ictl->thread_reqed =3D false; + ictl->irq =3D NULL; + } + + spin_unlock_irqrestore(&ictl->lock, flags); + + sched_cb(dev, ictl, ient); +} + +static irqreturn_t hevc_d_irq_thread(int irq, void *data) +{ + struct hevc_d_dev * const dev =3D data; + + do_thread(dev, &dev->ic_active1); + do_thread(dev, &dev->ic_active2); + + return IRQ_HANDLED; +} + +/* + * May only be called from Active1 CB + * IRQs should not be expected until execution continues in the cb + */ +void hevc_d_hw_irq_active1_thread(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback thread_cb, void *ctx) +{ + pre_thread(dev, ient, thread_cb, ctx, &dev->ic_active1); +} + +void hevc_d_hw_irq_active1_enable_claim(struct hevc_d_dev *dev, + int n) +{ + do_enable_claim(dev, n, &dev->ic_active1); +} + +void hevc_d_hw_irq_active1_claim(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback ready_cb, void *ctx) +{ + do_claim(dev, ient, ready_cb, ctx, &dev->ic_active1); +} + +void hevc_d_hw_irq_active1_irq(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback irq_cb, void *ctx) +{ + pre_irq(dev, ient, irq_cb, ctx, &dev->ic_active1); +} + +void hevc_d_hw_irq_active2_claim(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback ready_cb, void *ctx) +{ + do_claim(dev, ient, ready_cb, ctx, &dev->ic_active2); +} + +void hevc_d_hw_irq_active2_irq(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback irq_cb, void *ctx) +{ + pre_irq(dev, ient, irq_cb, ctx, &dev->ic_active2); +} + +/* + * Stop the clock for this context + * clk_disable_unprepare does ref counting so this will not actually + * disable 
the clock if there are other running contexts + */ +void hevc_d_hw_stop_clock(struct hevc_d_dev *dev) +{ + clk_disable_unprepare(dev->clock); +} + +/* Always starts the clock if it isn't already on this ctx */ +int hevc_d_hw_start_clock(struct hevc_d_dev *dev) +{ + int rv; + + rv =3D clk_set_min_rate(dev->clock, dev->max_clock_rate); + if (rv) { + dev_err(dev->dev, "Failed to set clock rate\n"); + return rv; + } + + rv =3D clk_prepare_enable(dev->clock); + if (rv) { + dev_err(dev->dev, "Failed to enable clock\n"); + return rv; + } + return 0; +} + +static int hw_setup(struct hevc_d_dev *dev) +{ + struct device_node *node; + u32 ver; + u32 irq_stat; + struct rpi_firmware *firmware; + + ver =3D apb_read(dev, RPI_VERSION); + if (ver !=3D 0x202) { + dev_err(dev->dev, "Unexpected version %#x only 0x202 supported\n", ver); + return -ENODEV; + } + + node =3D rpi_firmware_find_node(); + if (!node) + return -EINVAL; + + firmware =3D rpi_firmware_get(node); + of_node_put(node); + if (!firmware) + return -EPROBE_DEFER; + + dev->max_clock_rate =3D rpi_firmware_clk_get_max_rate(firmware, + RPI_FIRMWARE_HEVC_CLK_ID); + rpi_firmware_put(firmware); + + /* + * Enable IRQs & reset anything pending + * Whilst this seems the wrong way round the h/w doesn't actually + * set the IRQ status bits till the IRQs are enabled. As we haven't + * got the IRQ yet this should still be safe. 
+ */ + irq_write(dev, ARG_IC_ICTRL, + ARG_IC_ICTRL_ACTIVE1_EN_SET | ARG_IC_ICTRL_ACTIVE2_EN_SET); + irq_stat =3D irq_read(dev, ARG_IC_ICTRL); + irq_write(dev, ARG_IC_ICTRL, irq_stat); + + return 0; +} + +int hevc_d_hw_probe(struct hevc_d_dev *dev) +{ + int irq_dec; + int ret; + + ictl_init(&dev->ic_active1, HEVC_D_P2BUF_COUNT); + ictl_init(&dev->ic_active2, HEVC_D_ICTL_ENABLE_UNLIMITED); + + dev->base_irq =3D devm_platform_ioremap_resource_byname(dev->pdev, "intc"= ); + if (IS_ERR(dev->base_irq)) + return PTR_ERR(dev->base_irq); + + dev->base_h265 =3D devm_platform_ioremap_resource_byname(dev->pdev, "hevc= "); + if (IS_ERR(dev->base_h265)) + return PTR_ERR(dev->base_h265); + + dev->clock =3D devm_clk_get(&dev->pdev->dev, NULL); + if (IS_ERR(dev->clock)) + return PTR_ERR(dev->clock); + + ret =3D clk_prepare_enable(dev->clock); + if (ret) + return ret; + ret =3D hw_setup(dev); + clk_disable_unprepare(dev->clock); + if (ret) + return ret; + + irq_dec =3D platform_get_irq(dev->pdev, 0); + if (irq_dec <=3D 0) + return irq_dec; + ret =3D devm_request_threaded_irq(dev->dev, irq_dec, + hevc_d_irq_irq, + hevc_d_irq_thread, + 0, dev_name(dev->dev), dev); + if (ret) + dev_err(dev->dev, "Failed to request IRQ - %d\n", ret); + + return ret; +} + +void hevc_d_hw_remove(struct hevc_d_dev *dev) +{ + /* + * IRQ auto freed on unload so no need to do it here + * ioremap auto freed on unload + */ + ictl_uninit(&dev->ic_active1); + ictl_uninit(&dev->ic_active2); +} + diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_hw.h b/driv= ers/media/platform/raspberrypi/hevc_dec/hevc_d_hw.h new file mode 100644 index 000000000000..fedec3e0c10b --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_hw.h @@ -0,0 +1,317 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + 
* Copyright (C) 2018 Bootlin + */ + +#ifndef _HEVC_D_HW_H_ +#define _HEVC_D_HW_H_ + +struct hevc_d_hw_irq_ent { + struct hevc_d_hw_irq_ent *next; + hevc_d_irq_callback cb; + void *v; +}; + +/* Phase 1 Register offsets */ + +#define RPI_SPS0 0 +#define RPI_SPS1 4 +#define RPI_PPS 8 +#define RPI_SLICE 12 +#define RPI_TILESTART 16 +#define RPI_TILEEND 20 +#define RPI_SLICESTART 24 +#define RPI_MODE 28 +#define RPI_LEFT0 32 +#define RPI_LEFT1 36 +#define RPI_LEFT2 40 +#define RPI_LEFT3 44 +#define RPI_QP 48 +#define RPI_CONTROL 52 +#define RPI_STATUS 56 +#define RPI_VERSION 60 +#define RPI_BFBASE 64 +#define RPI_BFNUM 68 +#define RPI_BFCONTROL 72 +#define RPI_BFSTATUS 76 +#define RPI_PUWBASE 80 +#define RPI_PUWSTRIDE 84 +#define RPI_COEFFWBASE 88 +#define RPI_COEFFWSTRIDE 92 +#define RPI_SLICECMDS 96 +#define RPI_BEGINTILEEND 100 +#define RPI_TRANSFER 104 +#define RPI_CFBASE 108 +#define RPI_CFNUM 112 +#define RPI_CFSTATUS 116 + +/* Phase 2 Register offsets */ + +#define RPI_PURBASE 0x8000 +#define RPI_PURSTRIDE 0x8004 +#define RPI_COEFFRBASE 0x8008 +#define RPI_COEFFRSTRIDE 0x800C +#define RPI_NUMROWS 0x8010 +#define RPI_CONFIG2 0x8014 +#define RPI_OUTYBASE 0x8018 +#define RPI_OUTYSTRIDE 0x801C +#define RPI_OUTCBASE 0x8020 +#define RPI_OUTCSTRIDE 0x8024 +#define RPI_STATUS2 0x8028 +#define RPI_FRAMESIZE 0x802C +#define RPI_MVBASE 0x8030 +#define RPI_MVSTRIDE 0x8034 +#define RPI_COLBASE 0x8038 +#define RPI_COLSTRIDE 0x803C +#define RPI_CURRPOC 0x8040 + +/* + * Reference frame register values + * There are 16 of these arranged sequentially + */ +#define RPI_REFYBASE0 0x9000 +#define RPI_REFYSTRIDE0 0x9004 +#define RPI_REFCBASE0 0x9008 +#define RPI_REFCSTRIDE0 0x900c +/* Offset to get from REFYBASEn to REFYBASEn+1 */ +#define RPI_REFREGS_SIZE 16 + +/* + * Write a general register value + * Order is unimportant + */ +static inline void apb_write(const struct hevc_d_dev * const dev, + const unsigned int offset, const u32 val) +{ + writel_relaxed(val, dev->base_h265 + 
offset); +} + +/* Write the final register value that actually starts the phase */ +static inline void apb_write_final(const struct hevc_d_dev * const dev, + const unsigned int offset, const u32 val) +{ + writel(val, dev->base_h265 + offset); +} + +static inline u32 apb_read(const struct hevc_d_dev * const dev, + const unsigned int offset) +{ + return readl(dev->base_h265 + offset); +} + +static inline void irq_write(const struct hevc_d_dev * const dev, + const unsigned int offset, const u32 val) +{ + writel(val, dev->base_irq + offset); +} + +static inline u32 irq_read(const struct hevc_d_dev * const dev, + const unsigned int offset) +{ + return readl(dev->base_irq + offset); +} + +static inline void apb_write_vc_addr(const struct hevc_d_dev * const dev, + const unsigned int offset, + const dma_addr_t a) +{ + apb_write(dev, offset, (u32)(a >> 6)); +} + +static inline void apb_write_vc_addr_final(const struct hevc_d_dev * const= dev, + const unsigned int offset, + const dma_addr_t a) +{ + apb_write_final(dev, offset, (u32)(a >> 6)); +} + +static inline void apb_write_vc_len(const struct hevc_d_dev * const dev, + const unsigned int offset, + const unsigned int x) +{ + apb_write(dev, offset, (x + 63) >> 6); +} + +/* *ARG_IC_ICTRL - Interrupt control for ARGON Core* + * Offset (byte space) =3D 40'h2b10000 + * Physical Address (byte space) =3D 40'h7eb10000 + * Verilog Macro Address =3D `ARG_IC_REG_START + `ARGON_INTCTRL_ICTRL + * Reset Value =3D 32'b100x100x_100xxxxx_xxxxxxx0_x100x100 + * Access =3D RW (32-bit only) + * Interrupt control logic for ARGON Core. 
+ */ +#define ARG_IC_ICTRL 0 + +/* acc=3DLWC ACTIVE1_INT FIELD ACCESS: LWC + * + * Interrupt 1 + * This is set and held when an hevc_active1 interrupt edge is detected + * The polarity of the edge is set by the ACTIVE1_EDGE field + * Write a 1 to this bit to clear down the latched interrupt + * The latched interrupt is only enabled out onto the interrupt line if + * ACTIVE1_EN is set + * Reset value is *0* decimal. + */ +#define ARG_IC_ICTRL_ACTIVE1_INT_SET BIT(0) + +/* ACTIVE1_EDGE Sets the polarity of the interrupt edge detection logic + * This logic detects edges of the hevc_active1 line from the argon core + * 0 =3D negedge, 1 =3D posedge + * Reset value is *0* decimal. + */ +#define ARG_IC_ICTRL_ACTIVE1_EDGE_SET BIT(1) + +/* ACTIVE1_EN Enables ACTIVE1_INT out onto the argon interrupt line. + * If this isn't set, the interrupt logic will work but no interrupt will = be + * set to the interrupt controller + * Reset value is *1* decimal. + * + * [JC] The above appears to be a lie - if unset then b0 is never set + */ +#define ARG_IC_ICTRL_ACTIVE1_EN_SET BIT(2) + +/* acc=3DRO ACTIVE1_STATUS FIELD ACCESS: RO + * + * The current status of the hevc_active1 signal + */ +#define ARG_IC_ICTRL_ACTIVE1_STATUS_SET BIT(3) + +/* acc=3DLWC ACTIVE2_INT FIELD ACCESS: LWC + * + * Interrupt 2 + * This is set and held when an hevc_active2 interrupt edge is detected + * The polarity of the edge is set by the ACTIVE2_EDGE field + * Write a 1 to this bit to clear down the latched interrupt + * The latched interrupt is only enabled out onto the interrupt line if + * ACTIVE2_EN is set + * Reset value is *0* decimal. + */ +#define ARG_IC_ICTRL_ACTIVE2_INT_SET BIT(4) + +/* ACTIVE2_EDGE Sets the polarity of the interrupt edge detection logic + * This logic detects edges of the hevc_active2 line from the argon core + * 0 =3D negedge, 1 =3D posedge + * Reset value is *0* decimal. 
+ */ +#define ARG_IC_ICTRL_ACTIVE2_EDGE_SET BIT(5) + +/* ACTIVE2_EN Enables ACTIVE2_INT out onto the argon interrupt line. + * If this isn't set, the interrupt logic will work but no interrupt will = be + * set to the interrupt controller + * Reset value is *1* decimal. + */ +#define ARG_IC_ICTRL_ACTIVE2_EN_SET BIT(6) + +/* acc=3DRO ACTIVE2_STATUS FIELD ACCESS: RO + * + * The current status of the hevc_active2 signal + */ +#define ARG_IC_ICTRL_ACTIVE2_STATUS_SET BIT(7) + +/* TEST_INT Forces the argon int high for test purposes. + * Reset value is *0* decimal. + */ +#define ARG_IC_ICTRL_TEST_INT BIT(8) +#define ARG_IC_ICTRL_SPARE BIT(9) + +/* acc=3DRO VP9_INTERRUPT_STATUS FIELD ACCESS: RO + * + * The current status of the vp9_interrupt signal + */ +#define ARG_IC_ICTRL_VP9_INTERRUPT_STATUS BIT(10) + +/* AIO_INT_ENABLE 1 =3D Or the AIO int in with the Argon int so the VPU ca= n see + * it + * 0 =3D the AIO int is masked. (It should still be connected to the GIC t= hough). + */ +#define ARG_IC_ICTRL_AIO_INT_ENABLE BIT(20) +#define ARG_IC_ICTRL_H264_ACTIVE_INT BIT(21) +#define ARG_IC_ICTRL_H264_ACTIVE_EDGE BIT(22) +#define ARG_IC_ICTRL_H264_ACTIVE_EN BIT(23) +#define ARG_IC_ICTRL_H264_ACTIVE_STATUS BIT(24) +#define ARG_IC_ICTRL_H264_INTERRUPT_INT BIT(25) +#define ARG_IC_ICTRL_H264_INTERRUPT_EDGE BIT(26) +#define ARG_IC_ICTRL_H264_INTERRUPT_EN BIT(27) + +/* acc=3DRO H264_INTERRUPT_STATUS FIELD ACCESS: RO + * + * The current status of the h264_interrupt signal + */ +#define ARG_IC_ICTRL_H264_INTERRUPT_STATUS BIT(28) + +/* acc=3DLWC VP9_INTERRUPT_INT FIELD ACCESS: LWC + * + * Interrupt 1 + * This is set and held when an vp9_interrupt interrupt edge is detected + * The polarity of the edge is set by the VP9_INTERRUPT_EDGE field + * Write a 1 to this bit to clear down the latched interrupt + * The latched interrupt is only enabled out onto the interrupt line if + * VP9_INTERRUPT_EN is set + * Reset value is *0* decimal. 
+ */ +#define ARG_IC_ICTRL_VP9_INTERRUPT_INT BIT(29) + +/* VP9_INTERRUPT_EDGE Sets the polarity of the interrupt edge detection lo= gic + * This logic detects edges of the vp9_interrupt line from the argon h264 = core + * 0 =3D negedge, 1 =3D posedge + * Reset value is *0* decimal. + */ +#define ARG_IC_ICTRL_VP9_INTERRUPT_EDGE BIT(30) + +/* VP9_INTERRUPT_EN Enables VP9_INTERRUPT_INT out onto the argon interrupt= line. + * If this isn't set, the interrupt logic will work but no interrupt will = be + * set to the interrupt controller + * Reset value is *1* decimal. + */ +#define ARG_IC_ICTRL_VP9_INTERRUPT_EN BIT(31) + +/* Bits 19:12, 11 reserved - read ?, write 0 */ +#define ARG_IC_ICTRL_SET_ZERO_MASK ((0xff << 12) | BIT(11)) + +/* All IRQ bits */ +#define ARG_IC_ICTRL_ALL_IRQ_MASK (\ + ARG_IC_ICTRL_VP9_INTERRUPT_INT |\ + ARG_IC_ICTRL_H264_INTERRUPT_INT |\ + ARG_IC_ICTRL_ACTIVE1_INT_SET |\ + ARG_IC_ICTRL_ACTIVE2_INT_SET) + +/* Regulate claim Q */ +void hevc_d_hw_irq_active1_enable_claim(struct hevc_d_dev *dev, + int n); +/* Auto release once all CBs called */ +void hevc_d_hw_irq_active1_claim(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback ready_cb, void *ctx); +/* May only be called in claim cb */ +void hevc_d_hw_irq_active1_irq(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback irq_cb, void *ctx); +/* May only be called in irq cb */ +void hevc_d_hw_irq_active1_thread(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback thread_cb, void *ctx); + +/* Auto release once all CBs called */ +void hevc_d_hw_irq_active2_claim(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback ready_cb, void *ctx); +/* May only be called in claim cb */ +void hevc_d_hw_irq_active2_irq(struct hevc_d_dev *dev, + struct hevc_d_hw_irq_ent *ient, + hevc_d_irq_callback irq_cb, void *ctx); + +int hevc_d_hw_start_clock(struct hevc_d_dev *dev); +void hevc_d_hw_stop_clock(struct 
hevc_d_dev *dev); + +int hevc_d_hw_probe(struct hevc_d_dev *dev); +void hevc_d_hw_remove(struct hevc_d_dev *dev); + +#endif diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_video.c b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_video.c new file mode 100644 index 000000000000..d39a2e228595 --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_video.c @@ -0,0 +1,634 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + * Copyright (C) 2018 Bootlin + */ + +#include +#include +#include +#include +#include + +#include "hevc_d.h" +#include "hevc_d_h265.h" +#include "hevc_d_hw.h" +#include "hevc_d_video.h" + +static inline struct hevc_d_ctx *hevc_d_file2ctx(struct file *file) +{ + return container_of(file->private_data, struct hevc_d_ctx, fh); +} + +/* constrain x to y,y*2 */ +static inline unsigned int constrain2x(unsigned int x, unsigned int y) +{ + return (x < y) ? + y : + (x > y * 2) ? y : x; +} + +size_t hevc_d_round_up_size(const size_t x) +{ + /* Admit no size < 256 */ + const unsigned int n = x < 256 ? 8 : ilog2(x); + + return x >= (3 << n) ? 4 << n : (3 << n); +} + +size_t hevc_d_bit_buf_size(unsigned int w, unsigned int h, unsigned int bits_minus8) +{ + const size_t wxh = w * h; + size_t bits_alloc; + + /* Annex A gives a min compression of 2 @ lvl 3.1 + * (wxh <= 983040) and min 4 thereafter but avoid + * the oddity of 983041 having a lower limit than + * 983040. + * Multiply by 3/2 for 4:2:0 + */ + bits_alloc = wxh < 983040 ? wxh * 3 / 4 : + wxh < 983040 * 2 ? 
983040 * 3 / 4 : + wxh * 3 / 8; + /* Allow for bit depth */ + bits_alloc += (bits_alloc * bits_minus8) / 8; + return hevc_d_round_up_size(bits_alloc); +} + +void hevc_d_prepare_src_format(struct v4l2_pix_format_mplane *pix_fmt) +{ + size_t size; + u32 w; + u32 h; + + w = pix_fmt->width; + h = pix_fmt->height; + if (!w || !h) { + w = HEVC_D_DEFAULT_WIDTH; + h = HEVC_D_DEFAULT_HEIGHT; + } + if (w > HEVC_D_MAX_WIDTH) + w = HEVC_D_MAX_WIDTH; + if (h > HEVC_D_MAX_HEIGHT) + h = HEVC_D_MAX_HEIGHT; + + size = pix_fmt->plane_fmt[0].sizeimage; + if (!size || size > SZ_32M) { + /* Unspecified or way too big - pick max for size */ + size = hevc_d_bit_buf_size(w, h, 2); + } + /* Set a minimum */ + size = max_t(size_t, SZ_4K, size); + + pix_fmt->pixelformat = V4L2_PIX_FMT_HEVC_SLICE; + pix_fmt->width = w; + pix_fmt->height = h; + pix_fmt->num_planes = 1; + pix_fmt->field = V4L2_FIELD_NONE; + /* Zero bytes per line for encoded source. */ + pix_fmt->plane_fmt[0].bytesperline = 0; + pix_fmt->plane_fmt[0].sizeimage = size; +} + +/* Take any pix_format and make it valid */ +static void hevc_d_prepare_dst_format(struct v4l2_pix_format_mplane *pix_fmt) +{ + unsigned int width = pix_fmt->width; + unsigned int height = pix_fmt->height; + unsigned int sizeimage = pix_fmt->plane_fmt[0].sizeimage; + unsigned int bytesperline = pix_fmt->plane_fmt[0].bytesperline; + + if (!width) + width = HEVC_D_DEFAULT_WIDTH; + else + width = clamp(width, HEVC_D_MIN_WIDTH, HEVC_D_MAX_WIDTH); + if (!height) + height = HEVC_D_DEFAULT_HEIGHT; + else + height = clamp(height, HEVC_D_MIN_HEIGHT, HEVC_D_MAX_HEIGHT); + + /* For column formats set bytesperline to column height (stride2) */ + switch (pix_fmt->pixelformat) { + default: + pix_fmt->pixelformat = V4L2_PIX_FMT_NV12MT_COL128; + fallthrough; + case V4L2_PIX_FMT_NV12MT_COL128: + /* Width rounds up to columns */ + width = ALIGN(width, 128); + height = ALIGN(height, 
8); + + /* column height is sizeimage / bytesperline */ + bytesperline =3D width; + sizeimage =3D bytesperline * height; + break; + + case V4L2_PIX_FMT_NV12MT_10_COL128: + /* width in pixels (3 pels =3D 4 bytes) rounded to 128 byte + * columns + */ + width =3D ALIGN(((width + 2) / 3), 32) * 3; + height =3D ALIGN(height, 8); + + /* column height is sizeimage / bytesperline */ + bytesperline =3D width * 4 / 3; + sizeimage =3D bytesperline * height; + break; + } + + pix_fmt->width =3D width; + pix_fmt->height =3D height; + + pix_fmt->field =3D V4L2_FIELD_NONE; + pix_fmt->plane_fmt[0].bytesperline =3D bytesperline; + pix_fmt->plane_fmt[0].sizeimage =3D sizeimage; + pix_fmt->plane_fmt[1].bytesperline =3D bytesperline; + pix_fmt->plane_fmt[1].sizeimage =3D sizeimage / 2; + pix_fmt->num_planes =3D 2; +} + +static int hevc_d_querycap(struct file *file, void *priv, + struct v4l2_capability *cap) +{ + strscpy(cap->driver, HEVC_D_NAME, sizeof(cap->driver)); + strscpy(cap->card, HEVC_D_NAME, sizeof(cap->card)); + snprintf(cap->bus_info, sizeof(cap->bus_info), + "platform:%s", HEVC_D_NAME); + + return 0; +} + +static int hevc_d_enum_fmt_vid_out(struct file *file, void *priv, + struct v4l2_fmtdesc *f) +{ + /* + * Input formats + * H.265 Slice only + */ + if (f->index =3D=3D 0) { + f->pixelformat =3D V4L2_PIX_FMT_HEVC_SLICE; + return 0; + } + + return -EINVAL; +} + +static int hevc_d_hevc_validate_sps(const struct v4l2_ctrl_hevc_sps * cons= t sps) +{ + const unsigned int ctb_log2_size_y =3D + sps->log2_min_luma_coding_block_size_minus3 + 3 + + sps->log2_diff_max_min_luma_coding_block_size; + const unsigned int min_tb_log2_size_y =3D + sps->log2_min_luma_transform_block_size_minus2 + 2; + const unsigned int max_tb_log2_size_y =3D min_tb_log2_size_y + + sps->log2_diff_max_min_luma_transform_block_size; + + /* Local limitations */ + if (sps->pic_width_in_luma_samples < 32 || + sps->pic_width_in_luma_samples > 4096) + return 0; + if (sps->pic_height_in_luma_samples < 32 || + 
sps->pic_height_in_luma_samples > 4096) + return 0; + if (!(sps->bit_depth_luma_minus8 =3D=3D 0 || + sps->bit_depth_luma_minus8 =3D=3D 2)) + return 0; + if (sps->bit_depth_luma_minus8 !=3D sps->bit_depth_chroma_minus8) + return 0; + if (sps->chroma_format_idc !=3D 1) + return 0; + + /* Limits from H.265 7.4.3.2.1 */ + if (sps->log2_max_pic_order_cnt_lsb_minus4 > 12) + return 0; + if (sps->sps_max_dec_pic_buffering_minus1 > 15) + return 0; + if (sps->sps_max_num_reorder_pics > + sps->sps_max_dec_pic_buffering_minus1) + return 0; + if (ctb_log2_size_y > 6) + return 0; + if (max_tb_log2_size_y > 5) + return 0; + if (max_tb_log2_size_y > ctb_log2_size_y) + return 0; + if (sps->max_transform_hierarchy_depth_inter > + (ctb_log2_size_y - min_tb_log2_size_y)) + return 0; + if (sps->max_transform_hierarchy_depth_intra > + (ctb_log2_size_y - min_tb_log2_size_y)) + return 0; + /* Check pcm stuff */ + if (sps->num_short_term_ref_pic_sets > 64) + return 0; + if (sps->num_long_term_ref_pics_sps > 32) + return 0; + return 1; +} + +static u32 pixelformat_from_sps(const struct v4l2_ctrl_hevc_sps * const sp= s, + const int index) +{ + u32 pf =3D 0; + + if (!is_sps_set(sps) || !hevc_d_hevc_validate_sps(sps)) { + /* Treat this as an error? 
For now return both */ + if (index == 0) + pf = V4L2_PIX_FMT_NV12MT_COL128; + else if (index == 1) + pf = V4L2_PIX_FMT_NV12MT_10_COL128; + } else if (index == 0) { + if (sps->bit_depth_luma_minus8 == 0) + pf = V4L2_PIX_FMT_NV12MT_COL128; + else if (sps->bit_depth_luma_minus8 == 2) + pf = V4L2_PIX_FMT_NV12MT_10_COL128; + } + + return pf; +} + +static void copy_color(struct v4l2_pix_format_mplane *d, + const struct v4l2_pix_format_mplane *s) +{ + d->colorspace = s->colorspace; + d->xfer_func = s->xfer_func; + d->ycbcr_enc = s->ycbcr_enc; + d->quantization = s->quantization; +} + +static struct v4l2_pix_format_mplane +hevc_d_hevc_default_dst_fmt(struct hevc_d_ctx * const ctx) +{ + const struct v4l2_ctrl_hevc_sps * const sps = + hevc_d_find_control_data(ctx, V4L2_CID_STATELESS_HEVC_SPS); + struct v4l2_pix_format_mplane pix_fmt; + + memset(&pix_fmt, 0, sizeof(pix_fmt)); + if (is_sps_set(sps)) { + pix_fmt.width = sps->pic_width_in_luma_samples; + pix_fmt.height = sps->pic_height_in_luma_samples; + pix_fmt.pixelformat = pixelformat_from_sps(sps, 0); + } + + hevc_d_prepare_dst_format(&pix_fmt); + copy_color(&pix_fmt, &ctx->src_fmt); + + return pix_fmt; +} + +static u32 hevc_d_hevc_get_dst_pixelformat(struct hevc_d_ctx * const ctx, + const int index) +{ + const struct v4l2_ctrl_hevc_sps * const sps = + hevc_d_find_control_data(ctx, V4L2_CID_STATELESS_HEVC_SPS); + + return pixelformat_from_sps(sps, index); +} + +static int hevc_d_enum_fmt_vid_cap(struct file *file, void *priv, + struct v4l2_fmtdesc *f) +{ + struct hevc_d_ctx * const ctx = hevc_d_file2ctx(file); + + const u32 pf = hevc_d_hevc_get_dst_pixelformat(ctx, f->index); + + if (pf == 0) + return -EINVAL; + + f->pixelformat = pf; + return 0; +} + +/* + * get dst format - sets it to default if otherwise unset + * returns a pointer to the struct as a convenience + */ +static struct v4l2_pix_format_mplane *get_dst_fmt(struct hevc_d_ctx *const ctx) +{ + if
(!ctx->dst_fmt_set) + ctx->dst_fmt =3D hevc_d_hevc_default_dst_fmt(ctx); + return &ctx->dst_fmt; +} + +static int hevc_d_g_fmt_vid_cap(struct file *file, void *priv, + struct v4l2_format *f) +{ + struct hevc_d_ctx *ctx =3D hevc_d_file2ctx(file); + + f->fmt.pix_mp =3D *get_dst_fmt(ctx); + return 0; +} + +static int hevc_d_g_fmt_vid_out(struct file *file, void *priv, + struct v4l2_format *f) +{ + struct hevc_d_ctx *ctx =3D hevc_d_file2ctx(file); + + f->fmt.pix_mp =3D ctx->src_fmt; + return 0; +} + +static int hevc_d_try_fmt_vid_cap(struct file *file, void *priv, + struct v4l2_format *f) +{ + struct hevc_d_ctx *ctx =3D hevc_d_file2ctx(file); + const struct v4l2_ctrl_hevc_sps * const sps =3D + hevc_d_find_control_data(ctx, V4L2_CID_STATELESS_HEVC_SPS); + u32 pixelformat; + int i; + + for (i =3D 0; (pixelformat =3D pixelformat_from_sps(sps, i)) !=3D 0; i++)= { + if (f->fmt.pix_mp.pixelformat =3D=3D pixelformat) + break; + } + + /* + * We don't have any way of finding out colourspace so believe + * anything we are told - take anything set in src as a default + */ + if (f->fmt.pix_mp.colorspace =3D=3D V4L2_COLORSPACE_DEFAULT) + copy_color(&f->fmt.pix_mp, &ctx->src_fmt); + + f->fmt.pix_mp.pixelformat =3D pixelformat; + hevc_d_prepare_dst_format(&f->fmt.pix_mp); + return 0; +} + +static int hevc_d_try_fmt_vid_out(struct file *file, void *priv, + struct v4l2_format *f) +{ + hevc_d_prepare_src_format(&f->fmt.pix_mp); + return 0; +} + +static int hevc_d_s_fmt_vid_cap(struct file *file, void *priv, + struct v4l2_format *f) +{ + struct hevc_d_ctx *ctx =3D hevc_d_file2ctx(file); + struct vb2_queue *vq; + int ret; + + vq =3D v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type); + if (vb2_is_busy(vq)) + return -EBUSY; + + ret =3D hevc_d_try_fmt_vid_cap(file, priv, f); + if (ret) + return ret; + + ctx->dst_fmt =3D f->fmt.pix_mp; + ctx->dst_fmt_set =3D 1; + + return 0; +} + +static int hevc_d_s_fmt_vid_out(struct file *file, void *priv, + struct v4l2_format *f) +{ + struct hevc_d_ctx *ctx =3D 
hevc_d_file2ctx(file); + struct vb2_queue *vq; + int ret; + + vq =3D v4l2_m2m_get_vq(ctx->fh.m2m_ctx, f->type); + if (vb2_is_busy(vq)) + return -EBUSY; + + ret =3D hevc_d_try_fmt_vid_out(file, priv, f); + if (ret) + return ret; + + ctx->src_fmt =3D f->fmt.pix_mp; + ctx->dst_fmt_set =3D 0; /* Setting src invalidates dst */ + + /* Propagate colorspace information to capture. */ + copy_color(&ctx->dst_fmt, &f->fmt.pix_mp); + return 0; +} + +const struct v4l2_ioctl_ops hevc_d_ioctl_ops =3D { + .vidioc_querycap =3D hevc_d_querycap, + + .vidioc_enum_fmt_vid_cap =3D hevc_d_enum_fmt_vid_cap, + .vidioc_g_fmt_vid_cap_mplane =3D hevc_d_g_fmt_vid_cap, + .vidioc_try_fmt_vid_cap_mplane =3D hevc_d_try_fmt_vid_cap, + .vidioc_s_fmt_vid_cap_mplane =3D hevc_d_s_fmt_vid_cap, + + .vidioc_enum_fmt_vid_out =3D hevc_d_enum_fmt_vid_out, + .vidioc_g_fmt_vid_out_mplane =3D hevc_d_g_fmt_vid_out, + .vidioc_try_fmt_vid_out_mplane =3D hevc_d_try_fmt_vid_out, + .vidioc_s_fmt_vid_out_mplane =3D hevc_d_s_fmt_vid_out, + + .vidioc_reqbufs =3D v4l2_m2m_ioctl_reqbufs, + .vidioc_querybuf =3D v4l2_m2m_ioctl_querybuf, + .vidioc_qbuf =3D v4l2_m2m_ioctl_qbuf, + .vidioc_dqbuf =3D v4l2_m2m_ioctl_dqbuf, + .vidioc_prepare_buf =3D v4l2_m2m_ioctl_prepare_buf, + .vidioc_create_bufs =3D v4l2_m2m_ioctl_create_bufs, + .vidioc_expbuf =3D v4l2_m2m_ioctl_expbuf, + + .vidioc_streamon =3D v4l2_m2m_ioctl_streamon, + .vidioc_streamoff =3D v4l2_m2m_ioctl_streamoff, + + .vidioc_try_decoder_cmd =3D v4l2_m2m_ioctl_stateless_try_decoder_cmd, + .vidioc_decoder_cmd =3D v4l2_m2m_ioctl_stateless_decoder_cmd, + + .vidioc_subscribe_event =3D v4l2_ctrl_subscribe_event, + .vidioc_unsubscribe_event =3D v4l2_event_unsubscribe, +}; + +static int hevc_d_queue_setup(struct vb2_queue *vq, unsigned int *nbufs, + unsigned int *nplanes, unsigned int sizes[], + struct device *alloc_devs[]) +{ + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vq); + struct v4l2_pix_format_mplane *pix_fmt; + int expected_nplanes; + + if (V4L2_TYPE_IS_OUTPUT(vq->type)) 
{ + pix_fmt =3D &ctx->src_fmt; + expected_nplanes =3D 1; + } else { + pix_fmt =3D get_dst_fmt(ctx); + expected_nplanes =3D 2; + } + + if (*nplanes) { + if (*nplanes !=3D expected_nplanes || + sizes[0] < pix_fmt->plane_fmt[0].sizeimage || + sizes[1] < pix_fmt->plane_fmt[1].sizeimage) + return -EINVAL; + } else { + sizes[0] =3D pix_fmt->plane_fmt[0].sizeimage; + if (V4L2_TYPE_IS_OUTPUT(vq->type)) { + *nplanes =3D 1; + } else { + sizes[1] =3D pix_fmt->plane_fmt[1].sizeimage; + *nplanes =3D 2; + } + } + + return 0; +} + +static void hevc_d_queue_cleanup(struct vb2_queue *vq, u32 state) +{ + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vq); + struct vb2_v4l2_buffer *vbuf; + + for (;;) { + if (V4L2_TYPE_IS_OUTPUT(vq->type)) + vbuf =3D v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx); + else + vbuf =3D v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx); + + if (!vbuf) + return; + + v4l2_ctrl_request_complete(vbuf->vb2_buf.req_obj.req, + &ctx->hdl); + v4l2_m2m_buf_done(vbuf, state); + } +} + +static int hevc_d_buf_out_validate(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf =3D to_vb2_v4l2_buffer(vb); + + vbuf->field =3D V4L2_FIELD_NONE; + return 0; +} + +static int hevc_d_buf_prepare(struct vb2_buffer *vb) +{ + struct vb2_queue *vq =3D vb->vb2_queue; + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vq); + struct v4l2_pix_format_mplane *pix_fmt; + + if (V4L2_TYPE_IS_OUTPUT(vq->type)) + pix_fmt =3D &ctx->src_fmt; + else + pix_fmt =3D &ctx->dst_fmt; + + if (vb2_plane_size(vb, 0) < pix_fmt->plane_fmt[0].sizeimage || + vb2_plane_size(vb, 1) < pix_fmt->plane_fmt[1].sizeimage) + return -EINVAL; + + vb2_set_plane_payload(vb, 0, pix_fmt->plane_fmt[0].sizeimage); + vb2_set_plane_payload(vb, 1, pix_fmt->plane_fmt[1].sizeimage); + + return 0; +} + +static int hevc_d_start_streaming(struct vb2_queue *vq, unsigned int count) +{ + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vq); + struct hevc_d_dev *dev =3D ctx->dev; + int ret =3D 0; + + v4l2_m2m_update_start_streaming_state(ctx->fh.m2m_ctx, vq); + 
+ if (V4L2_TYPE_IS_OUTPUT(vq->type)) { + ret =3D hevc_d_hw_start_clock(dev); + if (ret) + goto fail_cleanup; + + ret =3D hevc_d_h265_start(ctx); + if (ret) + goto fail_stop_clock; + } + + return 0; + +fail_stop_clock: + hevc_d_hw_stop_clock(dev); +fail_cleanup: + v4l2_err(&dev->v4l2_dev, "%s: qtype=3D%d: FAIL\n", __func__, vq->type); + hevc_d_queue_cleanup(vq, VB2_BUF_STATE_QUEUED); + return ret; +} + +static void hevc_d_stop_streaming(struct vb2_queue *vq) +{ + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vq); + struct hevc_d_dev *dev =3D ctx->dev; + + if (V4L2_TYPE_IS_OUTPUT(vq->type)) { + hevc_d_h265_stop(ctx); + hevc_d_hw_stop_clock(dev); + } + + hevc_d_queue_cleanup(vq, VB2_BUF_STATE_ERROR); + + vb2_wait_for_all_buffers(vq); + + v4l2_m2m_update_stop_streaming_state(ctx->fh.m2m_ctx, vq); +} + +static void hevc_d_buf_queue(struct vb2_buffer *vb) +{ + struct vb2_v4l2_buffer *vbuf =3D to_vb2_v4l2_buffer(vb); + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vb->vb2_queue); + + v4l2_m2m_buf_queue(ctx->fh.m2m_ctx, vbuf); +} + +static void hevc_d_buf_request_complete(struct vb2_buffer *vb) +{ + struct hevc_d_ctx *ctx =3D vb2_get_drv_priv(vb->vb2_queue); + + v4l2_ctrl_request_complete(vb->req_obj.req, &ctx->hdl); +} + +static const struct vb2_ops hevc_d_qops =3D { + .queue_setup =3D hevc_d_queue_setup, + .buf_prepare =3D hevc_d_buf_prepare, + .buf_queue =3D hevc_d_buf_queue, + .buf_out_validate =3D hevc_d_buf_out_validate, + .buf_request_complete =3D hevc_d_buf_request_complete, + .start_streaming =3D hevc_d_start_streaming, + .stop_streaming =3D hevc_d_stop_streaming, +}; + +int hevc_d_queue_init(void *priv, struct vb2_queue *src_vq, + struct vb2_queue *dst_vq) +{ + struct hevc_d_ctx *ctx =3D priv; + int ret; + + src_vq->type =3D V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE; + src_vq->io_modes =3D VB2_MMAP | VB2_DMABUF; + src_vq->dma_attrs =3D DMA_ATTR_NO_KERNEL_MAPPING; + src_vq->drv_priv =3D ctx; + src_vq->buf_struct_size =3D sizeof(struct hevc_d_buffer); + src_vq->ops =3D 
&hevc_d_qops; + src_vq->mem_ops =3D &vb2_dma_contig_memops; + src_vq->timestamp_flags =3D V4L2_BUF_FLAG_TIMESTAMP_COPY; + src_vq->lock =3D &ctx->ctx_mutex; + src_vq->dev =3D ctx->dev->dev; + src_vq->supports_requests =3D true; + src_vq->requires_requests =3D true; + + ret =3D vb2_queue_init(src_vq); + if (ret) + return ret; + + dst_vq->type =3D V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE; + dst_vq->io_modes =3D VB2_MMAP | VB2_DMABUF; + dst_vq->dma_attrs =3D DMA_ATTR_NO_KERNEL_MAPPING; + dst_vq->drv_priv =3D ctx; + dst_vq->buf_struct_size =3D sizeof(struct hevc_d_buffer); + dst_vq->min_queued_buffers =3D 1; + dst_vq->ops =3D &hevc_d_qops; + dst_vq->mem_ops =3D &vb2_dma_contig_memops; + dst_vq->timestamp_flags =3D V4L2_BUF_FLAG_TIMESTAMP_COPY; + dst_vq->lock =3D &ctx->ctx_mutex; + dst_vq->dev =3D ctx->dev->dev; + + return vb2_queue_init(dst_vq); +} diff --git a/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_video.h b/d= rivers/media/platform/raspberrypi/hevc_dec/hevc_d_video.h new file mode 100644 index 000000000000..88000ca51711 --- /dev/null +++ b/drivers/media/platform/raspberrypi/hevc_dec/hevc_d_video.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Raspberry Pi HEVC driver + * + * Copyright (C) 2026 Raspberry Pi Ltd + * + * Based on the Cedrus VPU driver, that is: + * + * Copyright (C) 2016 Florent Revest + * Copyright (C) 2018 Paul Kocialkowski + * Copyright (C) 2018 Bootlin + */ + +#ifndef _HEVC_D_VIDEO_H_ +#define _HEVC_D_VIDEO_H_ + +struct hevc_d_format { + u32 pixelformat; + u32 directions; + unsigned int capabilities; +}; + +static inline int is_sps_set(const struct v4l2_ctrl_hevc_sps * const sps) +{ + return sps && sps->pic_width_in_luma_samples; +} + +extern const struct v4l2_ioctl_ops hevc_d_ioctl_ops; + +int hevc_d_queue_init(void *priv, struct vb2_queue *src_vq, + struct vb2_queue *dst_vq); + +size_t hevc_d_bit_buf_size(unsigned int w, unsigned int h, unsigned int bi= ts_minus8); +size_t hevc_d_round_up_size(const size_t x); + +void 
hevc_d_prepare_src_format(struct v4l2_pix_format_mplane *pix_fmt); + +#endif -- 2.34.1
From nobody Thu Apr 16 12:32:24 2026
From: Dave Stevenson
Date: Fri, 27 Feb 2026 17:19:11 +0000
Subject: [PATCH v5 6/6] arm: dts: bcm2711-rpi: Add HEVC decoder node
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260227-media-rpi-hevc-dec-v5-6-9bb3fc1816de@raspberrypi.com>
References: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
In-Reply-To: <20260227-media-rpi-hevc-dec-v5-0-9bb3fc1816de@raspberrypi.com>
To: Sakari Ailus, Laurent Pinchart, Mauro Carvalho Chehab, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Florian Fainelli, Broadcom internal kernel review list, John Cox, Dom Cobley, review list, Ezequiel Garcia
Cc: Nicolas Dufresne, John Cox, Stefan Wahren, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org, devicetree@vger.kernel.org, linux-rpi-kernel@lists.infradead.org, linux-arm-kernel@lists.infradead.org, Dave Stevenson

Add the configuration information for the HEVC decoder.
Signed-off-by: Dave Stevenson
---
 arch/arm/boot/dts/broadcom/bcm2711-rpi.dtsi | 4 ++++
 arch/arm/boot/dts/broadcom/bcm2711.dtsi     | 9 +++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/arm/boot/dts/broadcom/bcm2711-rpi.dtsi b/arch/arm/boot/dts/broadcom/bcm2711-rpi.dtsi
index 1eb6406449d1..aef5ff7b2a53 100644
--- a/arch/arm/boot/dts/broadcom/bcm2711-rpi.dtsi
+++ b/arch/arm/boot/dts/broadcom/bcm2711-rpi.dtsi
@@ -68,6 +68,10 @@ &hdmi1 {
 	wifi-2.4ghz-coexistence;
 };
 
+&hevc_dec {
+	clocks = <&firmware_clocks 11>;
+};
+
 &hvs {
 	clocks = <&firmware_clocks 4>;
 };
diff --git a/arch/arm/boot/dts/broadcom/bcm2711.dtsi b/arch/arm/boot/dts/broadcom/bcm2711.dtsi
index 5e3b4bb39396..7b2081ef0413 100644
--- a/arch/arm/boot/dts/broadcom/bcm2711.dtsi
+++ b/arch/arm/boot/dts/broadcom/bcm2711.dtsi
@@ -617,6 +617,15 @@ xhci: usb@7e9c0000 {
 			status = "disabled";
 		};
 
+		hevc_dec: codec@7eb00000 {
+			compatible = "brcm,bcm2711-hevc-dec";
+			reg = <0x0 0x7eb00000 0x10000>,
+			      <0x0 0x7eb10000 0x1000>;
+			reg-names = "hevc",
+				    "intc";
+			interrupts = ;
+		};
+
 		v3d: gpu@7ec00000 {
 			compatible = "brcm,2711-v3d";
 			reg = <0x0 0x7ec00000 0x4000>,
-- 
2.34.1