From: Deepa Guthyappa Madivalara
Date: Thu, 30 Oct 2025 00:00:10 -0700
Subject: [PATCH v3 5/5] media: iris: Add internal buffer calculation for AV1 decoder
Message-Id: <20251030-av1d_stateful_v3-v3-5-a1184de52fc4@oss.qualcomm.com>
References: <20251030-av1d_stateful_v3-v3-0-a1184de52fc4@oss.qualcomm.com>
In-Reply-To: <20251030-av1d_stateful_v3-v3-0-a1184de52fc4@oss.qualcomm.com>
To: Mauro Carvalho Chehab, Vikash Garodia, Dikshita Agarwal, Abhinav Kumar,
 Bryan O'Donoghue
Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-msm@vger.kernel.org, kernel test robot,
 Deepa Guthyappa Madivalara

Implement internal buffer count and size calculations for AV1 decoder
for all the buffer types required by the AV1 decoder, including BIN,
COMV, PERSIST, LINE, and PARTIAL.

This ensures the hardware decoder has properly allocated memory for AV1
decoding operations, enabling correct AV1 video playback.

Signed-off-by: Deepa Guthyappa Madivalara
Reviewed-by: Dikshita Agarwal
---
 drivers/media/platform/qcom/iris/iris_buffer.h     |   1 +
 drivers/media/platform/qcom/iris/iris_vpu_buffer.c | 297 ++++++++++++++++++++-
 drivers/media/platform/qcom/iris/iris_vpu_buffer.h | 116 ++++++++
 3 files changed, 411 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/qcom/iris/iris_buffer.h b/drivers/media/platform/qcom/iris/iris_buffer.h
index 5ef365d9236c7cbdee24a4614789b3191881968b..75bb767761824c4c02e0df9b765896cc093be333 100644
--- a/drivers/media/platform/qcom/iris/iris_buffer.h
+++ b/drivers/media/platform/qcom/iris/iris_buffer.h
@@ -27,6 +27,7 @@ struct iris_inst;
  * @BUF_SCRATCH_1: buffer to store decoding/encoding context data for HW
  * @BUF_SCRATCH_2: buffer to store encoding context data for HW
  * @BUF_VPSS: buffer to store VPSS context data for HW
+ * @BUF_PARTIAL: buffer for AV1 IBC data
  * @BUF_TYPE_MAX: max buffer types
  */
 enum iris_buffer_type {
diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
index 4463be05ce165adef6b152eb0c155d2e6a7b3c36..694f431cca98af945ac8afee41a0c2de6edac3ef 100644
--- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
+++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
@@ -9,6 +9,17 @@
 #include "iris_hfi_gen2_defines.h"
 
 #define HFI_MAX_COL_FRAME 6
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_HEIGHT (8)
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH (32)
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_HEIGHT (8)
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_WIDTH (16)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_HEIGHT (4)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH (48)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_HEIGHT (4)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_WIDTH (24)
+#define AV1D_SIZE_BSE_COL_MV_64x64 512
+#define AV1D_SIZE_BSE_COL_MV_128x128 2816
+
 
 #ifndef SYSTEM_LAL_TILE10
 #define SYSTEM_LAL_TILE10 192
@@ -39,6 +50,31 @@ static u32 hfi_buffer_bin_h264d(u32 frame_width, u32 frame_height, u32 num_vpp_p
 	return size_h264d_hw_bin_buffer(n_aligned_w, n_aligned_h, num_vpp_pipes);
 }
 
+static u32 size_av1d_hw_bin_buffer(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
+{
+	u32 size_yuv, size_bin_hdr, size_bin_res;
+
+	size_yuv = ((frame_width * frame_height) <= BIN_BUFFER_THRESHOLD) ?
+		((BIN_BUFFER_THRESHOLD * 3) >> 1) :
+		((frame_width * frame_height * 3) >> 1);
+	size_bin_hdr = size_yuv * AV1_CABAC_HDR_RATIO_HD_TOT;
+	size_bin_res = size_yuv * AV1_CABAC_RES_RATIO_HD_TOT;
+	size_bin_hdr = ALIGN(size_bin_hdr / num_vpp_pipes,
+			     DMA_ALIGNMENT) * num_vpp_pipes;
+	size_bin_res = ALIGN(size_bin_res / num_vpp_pipes,
+			     DMA_ALIGNMENT) * num_vpp_pipes;
+
+	return size_bin_hdr + size_bin_res;
+}
+
+static u32 hfi_buffer_bin_av1d(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
+{
+	u32 n_aligned_h = ALIGN(frame_height, 16);
+	u32 n_aligned_w = ALIGN(frame_width, 16);
+
+	return size_av1d_hw_bin_buffer(n_aligned_w, n_aligned_h, num_vpp_pipes);
+}
+
 static u32 size_h265d_hw_bin_buffer(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
 {
 	u32 product = frame_width * frame_height;
@@ -110,6 +146,26 @@ static u32 hfi_buffer_comv_h265d(u32 frame_width, u32 frame_height, u32 _comv_bu
 	return (_size * (_comv_bufcount)) + 512;
 }
 
+static u32 num_lcu(u32 frame_width, u32 frame_height, u32 lcu_size)
+{
+	return ((frame_width + lcu_size - 1) / lcu_size) *
+		((frame_height + lcu_size - 1) / lcu_size);
+}
+
+static u32 hfi_buffer_comv_av1d(u32 frame_width, u32 frame_height, u32 comv_bufcount)
+{
+	u32 size;
+
+	size = 2 * ALIGN(max(num_lcu(frame_width, frame_height, 64) *
+			     AV1D_SIZE_BSE_COL_MV_64x64,
+			     num_lcu(frame_width, frame_height, 128) *
+			     AV1D_SIZE_BSE_COL_MV_128x128),
+			 DMA_ALIGNMENT);
+	size *= comv_bufcount;
+
+	return size;
+}
+
 static u32 size_h264d_bse_cmd_buf(u32 frame_height)
 {
 	u32 height = ALIGN(frame_height, 32);
@@ -174,6 +230,20 @@ static u32 hfi_buffer_persist_h264d(void)
 		    DMA_ALIGNMENT);
 }
 
+static u32 hfi_buffer_persist_av1d(u32 max_width, u32 max_height, u32 total_ref_count)
+{
+	u32 comv_size, size;
+
+	comv_size = hfi_buffer_comv_av1d(max_width, max_height, total_ref_count);
+	size = ALIGN((SIZE_AV1D_SEQUENCE_HEADER * 2 + SIZE_AV1D_METADATA +
+		AV1D_NUM_HW_PIC_BUF * (SIZE_AV1D_TILE_OFFSET + SIZE_AV1D_QM) +
+		AV1D_NUM_FRAME_HEADERS * (SIZE_AV1D_FRAME_HEADER +
+		2 * SIZE_AV1D_PROB_TABLE) + comv_size + HDR10_HIST_EXTRADATA_SIZE +
+		SIZE_AV1D_METADATA * AV1D_NUM_HW_PIC_BUF), DMA_ALIGNMENT);
+
+	return ALIGN(size, DMA_ALIGNMENT);
+}
+
 static u32 hfi_buffer_non_comv_h264d(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
 {
 	u32 size_bse = size_h264d_bse_cmd_buf(frame_height);
@@ -459,6 +529,182 @@ static u32 hfi_buffer_line_h264d(u32 frame_width, u32 frame_height,
 	return ALIGN((size + vpss_lb_size), DMA_ALIGNMENT);
 }
 
+static u32 size_av1d_lb_opb_wr1_nv12_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size, y_width, y_width_a = 128;
+
+	y_width = ALIGN(frame_width, y_width_a);
+
+	size = ((y_width + HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH - 1) /
+		 HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH +
+		 (AV1D_MAX_TILE_COLS - 1));
+	return size * 256;
+}
+
+static u32 size_av1d_lb_opb_wr1_tp10_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size, y_width, y_width_a = 256;
+
+	y_width = ALIGN(frame_width, y_width_a);
+
+	size = ((y_width + HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH - 1) /
+		 HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH +
+		 (AV1D_MAX_TILE_COLS - 1));
+
+	return size * 256;
+}
+
+static u32 hfi_buffer_line_av1d(u32 frame_width, u32 frame_height,
+				bool is_opb, u32 num_vpp_pipes)
+{
+	u32 size, vpss_lb_size, opbwrbufsize, opbwr8, opbwr10;
+
+	size = ALIGN(size_av1d_lb_fe_top_data(frame_width, frame_height),
+		     DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_fe_top_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_fe_left_data(frame_width, frame_height),
+		      DMA_ALIGNMENT) * num_vpp_pipes +
+		ALIGN(size_av1d_lb_fe_left_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) * num_vpp_pipes +
+		ALIGN(size_av1d_lb_se_left_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) * num_vpp_pipes +
+		ALIGN(size_av1d_lb_se_top_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_pe_top_data(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_vsp_top(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_recon_dma_metadata_wr
+		      (frame_width, frame_height), DMA_ALIGNMENT) * 2 +
+		ALIGN(size_av1d_qp(frame_width, frame_height), DMA_ALIGNMENT);
+	opbwr8 = size_av1d_lb_opb_wr1_nv12_ubwc(frame_width, frame_height);
+	opbwr10 = size_av1d_lb_opb_wr1_tp10_ubwc(frame_width, frame_height);
+	opbwrbufsize = opbwr8 >= opbwr10 ? opbwr8 : opbwr10;
+	size = ALIGN((size + opbwrbufsize), DMA_ALIGNMENT);
+	if (is_opb) {
+		vpss_lb_size = size_vpss_lb(frame_width, frame_height);
+		size = ALIGN((size + vpss_lb_size) * 2, DMA_ALIGNMENT);
+	}
+
+	return size;
+}
+
+static u32 size_av1d_ibc_nv12_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size;
+	u32 y_width_a = 128, y_height_a = 32;
+	u32 uv_width_a = 128, uv_height_a = 32;
+	u32 ybufsize, uvbufsize, y_width, y_height, uv_width, uv_height;
+	u32 y_meta_width_a = 64, y_meta_height_a = 16;
+	u32 uv_meta_width_a = 64, uv_meta_height_a = 16;
+	u32 meta_height, meta_stride, meta_size;
+	u32 tile_width_y = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH;
+	u32 tile_height_y = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_HEIGHT;
+	u32 tile_width_uv = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_WIDTH;
+	u32 tile_height_uv = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_HEIGHT;
+
+	y_width = ALIGN(frame_width, y_width_a);
+	y_height = ALIGN(frame_height, y_height_a);
+	uv_width = ALIGN(frame_width, uv_width_a);
+	uv_height = ALIGN(((frame_height + 1) >> 1), uv_height_a);
+	ybufsize = ALIGN((y_width * y_height), HFI_ALIGNMENT_4096);
+	uvbufsize = ALIGN(uv_width * uv_height, HFI_ALIGNMENT_4096);
+	size = ybufsize + uvbufsize;
+	meta_stride = ALIGN(((frame_width + (tile_width_y - 1)) / tile_width_y),
+			    y_meta_width_a);
+	meta_height = ALIGN(((frame_height + (tile_height_y - 1)) / tile_height_y),
+			    y_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+	meta_stride = ALIGN(((((frame_width + 1) >> 1) + (tile_width_uv - 1)) /
+			     tile_width_uv), uv_meta_width_a);
+	meta_height = ALIGN(((((frame_height + 1) >> 1) + (tile_height_uv - 1)) /
+			     tile_height_uv), uv_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+
+	return size;
+}
+
+static u32 hfi_yuv420_tp10_calc_y_stride(u32 frame_width, u32 stride_multiple)
+{
+	u32 stride;
+
+	stride = ALIGN(frame_width, 192);
+	stride = ALIGN(stride * 4 / 3, stride_multiple);
+
+	return stride;
+}
+
+static u32 hfi_yuv420_tp10_calc_y_bufheight(u32 frame_height, u32 min_buf_height_multiple)
+{
+	return ALIGN(frame_height, min_buf_height_multiple);
+}
+
+static u32 hfi_yuv420_tp10_calc_uv_stride(u32 frame_width, u32 stride_multiple)
+{
+	u32 stride;
+
+	stride = ALIGN(frame_width, 192);
+	stride = ALIGN(stride * 4 / 3, stride_multiple);
+
+	return stride;
+}
+
+static u32 hfi_yuv420_tp10_calc_uv_bufheight(u32 frame_height, u32 min_buf_height_multiple)
+{
+	return ALIGN(((frame_height + 1) >> 1), min_buf_height_multiple);
+}
+
+static u32 size_av1d_ibc_tp10_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size;
+	u32 y_width_a = 256, y_height_a = 16,
+		uv_width_a = 256, uv_height_a = 16;
+	u32 ybufsize, uvbufsize, y_width, y_height, uv_width, uv_height;
+	u32 y_meta_width_a = 64, y_meta_height_a = 16,
+		uv_meta_width_a = 64, uv_meta_height_a = 16;
+	u32 meta_height, meta_stride, meta_size;
+	u32 tile_width_y = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH;
+	u32 tile_height_y = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_HEIGHT;
+	u32 tile_width_uv = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_WIDTH;
+	u32 tile_height_uv = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_HEIGHT;
+
+	y_width = hfi_yuv420_tp10_calc_y_stride(frame_width, y_width_a);
+	y_height = hfi_yuv420_tp10_calc_y_bufheight(frame_height, y_height_a);
+	uv_width = hfi_yuv420_tp10_calc_uv_stride(frame_width, uv_width_a);
+	uv_height = hfi_yuv420_tp10_calc_uv_bufheight(frame_height, uv_height_a);
+	ybufsize = ALIGN(y_width * y_height, HFI_ALIGNMENT_4096);
+	uvbufsize = ALIGN(uv_width * uv_height, HFI_ALIGNMENT_4096);
+	size = ybufsize + uvbufsize;
+	meta_stride = ALIGN(((frame_width + (tile_width_y - 1)) / tile_width_y),
+			    y_meta_width_a);
+	meta_height = ALIGN(((frame_height + (tile_height_y - 1)) / tile_height_y),
+			    y_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+	meta_stride = ALIGN(((((frame_width + 1) >> 1) + (tile_width_uv - 1)) /
+			     tile_width_uv), uv_meta_width_a);
+	meta_height = ALIGN(((((frame_height + 1) >> 1) + (tile_height_uv - 1)) /
+			     tile_height_uv), uv_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+
+	return size;
+}
+
+static u32 hfi_buffer_ibc_av1d(u32 frame_width, u32 frame_height)
+{
+	u32 size, ibc8, ibc10;
+
+	ibc8 = size_av1d_ibc_nv12_ubwc(frame_width, frame_height);
+	ibc10 = size_av1d_ibc_tp10_ubwc(frame_width, frame_height);
+	size = ibc8 >= ibc10 ? ibc8 : ibc10;
+
+	return ALIGN(size, DMA_ALIGNMENT);
+}
+
 static u32 iris_vpu_dec_bin_size(struct iris_inst *inst)
 {
 	u32 num_vpp_pipes = inst->core->iris_platform_data->num_vpp_pipe;
@@ -472,6 +718,8 @@ static u32 iris_vpu_dec_bin_size(struct iris_inst *inst)
 		return hfi_buffer_bin_h265d(width, height, num_vpp_pipes);
 	else if (inst->codec == V4L2_PIX_FMT_VP9)
 		return hfi_buffer_bin_vp9d(width, height, num_vpp_pipes);
+	else if (inst->codec == V4L2_PIX_FMT_AV1)
+		return hfi_buffer_bin_av1d(width, height, num_vpp_pipes);
 
 	return 0;
 }
@@ -487,18 +735,34 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
 		return hfi_buffer_comv_h264d(width, height, num_comv);
 	else if (inst->codec == V4L2_PIX_FMT_HEVC)
 		return hfi_buffer_comv_h265d(width, height, num_comv);
+	else if (inst->codec == V4L2_PIX_FMT_AV1) {
+		if (inst->fw_caps[DRAP].value)
+			return 0;
+		else
+			return hfi_buffer_comv_av1d(width, height, num_comv);
+	}
 
 	return 0;
 }
 
 static u32 iris_vpu_dec_persist_size(struct iris_inst *inst)
 {
+	struct platform_inst_caps *caps;
+
 	if (inst->codec == V4L2_PIX_FMT_H264)
 		return hfi_buffer_persist_h264d();
 	else if (inst->codec == V4L2_PIX_FMT_HEVC)
 		return hfi_buffer_persist_h265d(0);
 	else if (inst->codec == V4L2_PIX_FMT_VP9)
 		return hfi_buffer_persist_vp9d();
+	else if (inst->codec == V4L2_PIX_FMT_AV1) {
+		caps = inst->core->iris_platform_data->inst_caps;
+		if (inst->fw_caps[DRAP].value)
+			return hfi_buffer_persist_av1d(caps->max_frame_width,
+						       caps->max_frame_height, 16);
+		else
+			return hfi_buffer_persist_av1d(0, 0, 0);
+	}
 
 	return 0;
 }
@@ -545,6 +809,8 @@ static u32 iris_vpu_dec_line_size(struct iris_inst *inst)
 	else if (inst->codec == V4L2_PIX_FMT_VP9)
 		return hfi_buffer_line_vp9d(width, height, out_min_count, is_opb,
 					    num_vpp_pipes);
+	else if (inst->codec == V4L2_PIX_FMT_AV1)
+		return hfi_buffer_line_av1d(width, height, is_opb, num_vpp_pipes);
 
 	return 0;
 }
@@ -653,6 +919,15 @@ static u32 iris_vpu_enc_bin_size(struct iris_inst *inst)
 				  num_vpp_pipes, inst->hfi_rc_type);
 }
 
+static u32 iris_vpu_dec_partial_size(struct iris_inst *inst)
+{
+	struct v4l2_format *f = inst->fmt_src;
+	u32 height = f->fmt.pix_mp.height;
+	u32 width = f->fmt.pix_mp.width;
+
+	return hfi_buffer_ibc_av1d(width, height);
+}
+
 static inline
 u32 hfi_buffer_comv_enc(u32 frame_width, u32 frame_height, u32 lcu_size,
 			u32 num_recon, u32 standard)
@@ -1414,7 +1689,9 @@ static int output_min_count(struct iris_inst *inst)
 
 	/* fw_min_count > 0 indicates reconfig event has already arrived */
 	if (inst->fw_min_count) {
-		if (iris_split_mode_enabled(inst) && inst->codec == V4L2_PIX_FMT_VP9)
+		if (iris_split_mode_enabled(inst) &&
+		    (inst->codec == V4L2_PIX_FMT_VP9 ||
+		     inst->codec == V4L2_PIX_FMT_AV1))
 			return min_t(u32, 4, inst->fw_min_count);
 		else
 			return inst->fw_min_count;
@@ -1422,6 +1699,8 @@ static int output_min_count(struct iris_inst *inst)
 
 	if (inst->codec == V4L2_PIX_FMT_VP9)
 		output_min_count = 9;
+	else if (inst->codec == V4L2_PIX_FMT_AV1)
+		output_min_count = 11;
 
 	return output_min_count;
 }
@@ -1444,6 +1723,7 @@ u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type)
 		{BUF_PERSIST, iris_vpu_dec_persist_size },
 		{BUF_DPB, iris_vpu_dec_dpb_size },
 		{BUF_SCRATCH_1, iris_vpu_dec_scratch1_size },
+		{BUF_PARTIAL, iris_vpu_dec_partial_size },
 	};
 
 	static const struct iris_vpu_buf_type_handle enc_internal_buf_type_handle[] = {
@@ -1510,14 +1790,20 @@ static u32 internal_buffer_count(struct iris_inst *inst,
 		   buffer_type == BUF_PERSIST) {
 		return 1;
 	} else if (buffer_type == BUF_COMV || buffer_type == BUF_NON_COMV) {
-		if (inst->codec == V4L2_PIX_FMT_H264 || inst->codec == V4L2_PIX_FMT_HEVC)
+		if (inst->codec == V4L2_PIX_FMT_H264 ||
+		    inst->codec == V4L2_PIX_FMT_HEVC ||
+		    inst->codec == V4L2_PIX_FMT_AV1)
 			return 1;
 	}
+
 	return 0;
 }
 
 static inline int iris_vpu_dpb_count(struct iris_inst *inst)
 {
+	if (inst->codec == V4L2_PIX_FMT_AV1)
+		return 11;
+
 	if (iris_split_mode_enabled(inst)) {
 		return inst->fw_min_count ?
 			inst->fw_min_count : inst->buffers[BUF_OUTPUT].min_count;
@@ -1536,9 +1822,13 @@ int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffer_type
 			return MIN_BUFFERS;
 		else
 			return output_min_count(inst);
+	case BUF_NON_COMV:
+		if (inst->codec == V4L2_PIX_FMT_AV1)
+			return 0;
+		else
+			return 1;
 	case BUF_BIN:
 	case BUF_COMV:
-	case BUF_NON_COMV:
 	case BUF_LINE:
 	case BUF_PERSIST:
 		return internal_buffer_count(inst, buffer_type);
@@ -1546,6 +1836,7 @@ int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffer_type
 	case BUF_SCRATCH_2:
 	case BUF_VPSS:
 	case BUF_ARP:
+	case BUF_PARTIAL:
 		return 1; /* internal buffer count needed by firmware is 1 */
 	case BUF_DPB:
 		return iris_vpu_dpb_count(inst);
diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
index 04f0b7400a1e4e1d274d690a2761b9e57778e8b7..a8ffc1286260bddbf49df18fba0a23f78056a484 100644
--- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
+++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
@@ -11,6 +11,7 @@ struct iris_inst;
 #define MIN_BUFFERS 4
 
 #define DMA_ALIGNMENT 256
+#define HFI_ALIGNMENT_4096 4096
 
 #define NUM_HW_PIC_BUF 32
 #define LCU_MAX_SIZE_PELS 64
@@ -81,6 +82,22 @@ struct iris_inst;
 #define MAX_PE_NBR_DATA_LCU64_LINE_BUFFER_SIZE 384
 #define MAX_FE_NBR_DATA_LUMA_LINE_BUFFER_SIZE 640
 
+#define AV1_CABAC_HDR_RATIO_HD_TOT 2
+#define AV1_CABAC_RES_RATIO_HD_TOT 2
+#define AV1D_LCU_MAX_SIZE_PELS 128
+#define AV1D_LCU_MIN_SIZE_PELS 64
+#define AV1D_MAX_TILE_COLS 64
+#define MAX_PE_NBR_DATA_LCU32_LINE_BUFFER_SIZE 192
+#define MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE 96
+#define AV1D_NUM_HW_PIC_BUF 16
+#define AV1D_NUM_FRAME_HEADERS 16
+#define SIZE_AV1D_SEQUENCE_HEADER 768
+#define SIZE_AV1D_METADATA 512
+#define SIZE_AV1D_FRAME_HEADER 1280
+#define SIZE_AV1D_TILE_OFFSET 65536
+#define SIZE_AV1D_QM 3328
+#define SIZE_AV1D_PROB_TABLE 22784
+
 #define SIZE_SLICE_CMD_BUFFER (ALIGN(20480, 256))
 #define SIZE_SPS_PPS_SLICE_HDR (2048 + 4096)
 #define SIZE_BSE_SLICE_CMD_BUF ((((8192 << 2) + 7) & (~7)) * 3)
@@ -101,6 +118,15 @@ struct iris_inst;
 #define NUM_MBS_4K (DIV_ROUND_UP(MAX_WIDTH, 16) * DIV_ROUND_UP(MAX_HEIGHT, 16))
 #define NUM_MBS_720P (((ALIGN(1280, 16)) >> 4) * ((ALIGN(736, 16)) >> 4))
 
+#define BITS_PER_PIX 16
+#define NUM_LINES_LUMA 10
+#define NUM_LINES_CHROMA 6
+#define AV1D_LCU_MAX_SIZE_PELS 128
+#define AV1D_LCU_MIN_SIZE_PELS 64
+#define AV1D_MAX_TILE_COLS 64
+#define BITS_PER_CTRL_PACK 128
+#define NUM_CTRL_PACK_LCU 10
+
 static inline u32 size_h264d_lb_fe_top_data(u32 frame_width)
 {
 	return MAX_FE_NBR_DATA_LUMA_LINE_BUFFER_SIZE * ALIGN(frame_width, 16) * 3;
@@ -146,6 +172,96 @@ static inline u32 size_h264d_qp(u32 frame_width, u32 frame_height)
 	return DIV_ROUND_UP(frame_width, 64) * DIV_ROUND_UP(frame_height, 64) * 128;
 }
 
+static inline u32 size_av1d_lb_fe_top_data(u32 frame_width, u32 frame_height)
+{
+	return (ALIGN(frame_width, AV1D_LCU_MAX_SIZE_PELS) *
+		((BITS_PER_PIX * NUM_LINES_LUMA) >> 3) +
+		ALIGN(frame_width, AV1D_LCU_MAX_SIZE_PELS) / 2 *
+		((BITS_PER_PIX * NUM_LINES_CHROMA) >> 3) * 2);
+}
+
+static inline u32 size_av1d_lb_fe_left_data(u32 frame_width, u32 frame_height)
+{
+	return (32 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 16) +
+		16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 2 +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 8) * 2 +
+		24 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 16) +
+		24 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 2 +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 12) * 2 +
+		24 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 16) +
+		16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 16) +
+		16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 2 +
+		      ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		      AV1D_LCU_MIN_SIZE_PELS * 12) * 2);
+}
+
+static inline u32 size_av1d_lb_fe_top_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (NUM_CTRL_PACK_LCU * ((frame_width + AV1D_LCU_MIN_SIZE_PELS - 1) /
+		AV1D_LCU_MIN_SIZE_PELS) * BITS_PER_CTRL_PACK / 8);
+}
+
+static inline u32 size_av1d_lb_fe_left_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (16 * ((ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 16) +
+		(ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS)) +
+		3 * 16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS));
+}
+
+static inline u32 size_av1d_lb_se_top_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (((frame_width + 7) / 8) * MAX_SE_NBR_CTRL_LCU16_LINE_BUFFER_SIZE);
+}
+
+static inline u32 size_av1d_lb_se_left_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (max(((frame_height + 15) / 16) *
+		MAX_SE_NBR_CTRL_LCU16_LINE_BUFFER_SIZE,
+		max(((frame_height + 31) / 32) *
+		MAX_SE_NBR_CTRL_LCU32_LINE_BUFFER_SIZE,
+		((frame_height + 63) / 64) *
+		MAX_SE_NBR_CTRL_LCU64_LINE_BUFFER_SIZE)));
+}
+
+static inline u32 size_av1d_lb_pe_top_data(u32 frame_width, u32 frame_height)
+{
+	return (max(((frame_width + 15) / 16) *
+		MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE,
+		max(((frame_width + 31) / 32) *
+		MAX_PE_NBR_DATA_LCU32_LINE_BUFFER_SIZE,
+		((frame_width + 63) / 64) *
+		MAX_PE_NBR_DATA_LCU64_LINE_BUFFER_SIZE)));
+}
+
+static inline u32 size_av1d_lb_vsp_top(u32 frame_width, u32 frame_height)
+{
+	return (max(((frame_width + 63) / 64) * 1280,
+		((frame_width + 127) / 128) * 2304));
+}
+
+static inline u32 size_av1d_lb_recon_dma_metadata_wr(u32 frame_width,
+						     u32 frame_height)
+{
+	return ((ALIGN(frame_height, 8) / (4 / 2)) * 64);
+}
+
+static inline u32 size_av1d_qp(u32 frame_width, u32 frame_height)
+{
+	return size_h264d_qp(frame_width, frame_height);
+}
+
 u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type);
 u32 iris_vpu33_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type);
 int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffer_type);

-- 
2.34.1
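
As a rough illustration of the COMV sizing done by hfi_buffer_comv_av1d()
in this patch, the stand-alone user-space sketch below reproduces the
arithmetic outside the kernel. It is only a sketch: align_up() is an
assumed stand-in for the kernel's ALIGN() macro, a plain ternary replaces
max(), and comv_size_av1d() is a hypothetical name used only here. The
three constants are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch. */
#define DMA_ALIGNMENT                 256u
#define AV1D_SIZE_BSE_COL_MV_64x64    512u
#define AV1D_SIZE_BSE_COL_MV_128x128  2816u

/* Assumed stand-in for the kernel's ALIGN(): round x up to a multiple of a. */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	assert(a && (a & (a - 1)) == 0);	/* a must be a power of two */
	return (x + a - 1) & ~(a - 1);
}

/* Number of lcu_size x lcu_size blocks needed to cover the frame. */
static uint32_t num_lcu(uint32_t width, uint32_t height, uint32_t lcu_size)
{
	return ((width + lcu_size - 1) / lcu_size) *
	       ((height + lcu_size - 1) / lcu_size);
}

/*
 * Hypothetical mirror of hfi_buffer_comv_av1d(): take the larger of the
 * 64x64 and 128x128 per-frame costs, align it to DMA_ALIGNMENT, double
 * it, and scale by the number of COMV buffers.
 */
static uint32_t comv_size_av1d(uint32_t width, uint32_t height, uint32_t count)
{
	uint32_t sz64 = num_lcu(width, height, 64) * AV1D_SIZE_BSE_COL_MV_64x64;
	uint32_t sz128 = num_lcu(width, height, 128) * AV1D_SIZE_BSE_COL_MV_128x128;
	uint32_t worst = sz64 > sz128 ? sz64 : sz128;

	return 2 * align_up(worst, DMA_ALIGNMENT) * count;
}
```

For a 1920x1080 frame with a single COMV buffer, the 128x128 term is the
larger one (135 blocks * 2816 bytes = 380160, which is already a multiple
of 256), so comv_size_av1d(1920, 1080, 1) evaluates to 760320 bytes.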