From: Deepa Guthyappa Madivalara
Date: Wed, 10 Dec 2025 10:59:08 -0800
Subject: [PATCH v10 5/5] media: iris: Add internal buffer calculation for AV1 decoder
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20251210-av1d_stateful_v3-v10-5-cf4379a3dcff@oss.qualcomm.com>
References: <20251210-av1d_stateful_v3-v10-0-cf4379a3dcff@oss.qualcomm.com>
In-Reply-To: <20251210-av1d_stateful_v3-v10-0-cf4379a3dcff@oss.qualcomm.com>
To: Mauro Carvalho Chehab, Vikash Garodia, Dikshita Agarwal, Abhinav Kumar,
 Bryan O'Donoghue
Cc: linux-media@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arm-msm@vger.kernel.org, kernel test robot,
 Deepa Guthyappa Madivalara
X-Mailer: b4 0.14.2

Implement internal buffer count and size calculations for AV1 decoder
for all the buffer types required by the AV1 decoder, including BIN,
COMV, PERSIST, LINE, and PARTIAL.

This ensures the hardware decoder has properly allocated memory for AV1
decoding operations, enabling correct AV1 video playback.

Reviewed-by: Dikshita Agarwal
Signed-off-by: Deepa Guthyappa Madivalara
---
 drivers/media/platform/qcom/iris/iris_buffer.h     |   1 +
 drivers/media/platform/qcom/iris/iris_vpu_buffer.c | 299 ++++++++++++++++++++-
 drivers/media/platform/qcom/iris/iris_vpu_buffer.h | 116 ++++++++
 3 files changed, 412 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/qcom/iris/iris_buffer.h b/drivers/media/platform/qcom/iris/iris_buffer.h
index 5ef365d9236c7cbdee24a4614789b3191881968b..75bb767761824c4c02e0df9b765896cc093be333 100644
--- a/drivers/media/platform/qcom/iris/iris_buffer.h
+++ b/drivers/media/platform/qcom/iris/iris_buffer.h
@@ -27,6 +27,7 @@ struct iris_inst;
  * @BUF_SCRATCH_1: buffer to store decoding/encoding context data for HW
  * @BUF_SCRATCH_2: buffer to store encoding context data for HW
  * @BUF_VPSS: buffer to store VPSS context data for HW
+ * @BUF_PARTIAL: buffer for AV1 IBC data
  * @BUF_TYPE_MAX: max buffer types
  */
 enum iris_buffer_type {
diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
index 4463be05ce165adef6b152eb0c155d2e6a7b3c36..f4985790bae4104ec5769e2de16a56727af462ba 100644
--- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
+++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c
@@ -9,6 +9,17 @@
 #include "iris_hfi_gen2_defines.h"
 
 #define HFI_MAX_COL_FRAME 6
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_HEIGHT (8)
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH (32)
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_HEIGHT (8)
+#define HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_WIDTH (16)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_HEIGHT (4)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH (48)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_HEIGHT (4)
+#define HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_WIDTH (24)
+#define AV1D_SIZE_BSE_COL_MV_64x64 512
+#define AV1D_SIZE_BSE_COL_MV_128x128 2816
+#define UBWC_TILE_SIZE 256
 
 #ifndef SYSTEM_LAL_TILE10
 #define SYSTEM_LAL_TILE10 192
@@ -39,6 +50,31 @@ static u32 hfi_buffer_bin_h264d(u32 frame_width, u32 frame_height, u32 num_vpp_p
 	return size_h264d_hw_bin_buffer(n_aligned_w, n_aligned_h, num_vpp_pipes);
 }
 
+static u32 size_av1d_hw_bin_buffer(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
+{
+	u32 size_yuv, size_bin_hdr, size_bin_res;
+
+	size_yuv = ((frame_width * frame_height) <= BIN_BUFFER_THRESHOLD) ?
+		((BIN_BUFFER_THRESHOLD * 3) >> 1) :
+		((frame_width * frame_height * 3) >> 1);
+	size_bin_hdr = size_yuv * AV1_CABAC_HDR_RATIO_HD_TOT;
+	size_bin_res = size_yuv * AV1_CABAC_RES_RATIO_HD_TOT;
+	size_bin_hdr = ALIGN(size_bin_hdr / num_vpp_pipes,
+			     DMA_ALIGNMENT) * num_vpp_pipes;
+	size_bin_res = ALIGN(size_bin_res / num_vpp_pipes,
+			     DMA_ALIGNMENT) * num_vpp_pipes;
+
+	return size_bin_hdr + size_bin_res;
+}
+
+static u32 hfi_buffer_bin_av1d(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
+{
+	u32 n_aligned_h = ALIGN(frame_height, 16);
+	u32 n_aligned_w = ALIGN(frame_width, 16);
+
+	return size_av1d_hw_bin_buffer(n_aligned_w, n_aligned_h, num_vpp_pipes);
+}
+
 static u32 size_h265d_hw_bin_buffer(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
 {
 	u32 product = frame_width * frame_height;
@@ -110,6 +146,26 @@ static u32 hfi_buffer_comv_h265d(u32 frame_width, u32 frame_height, u32 _comv_bu
 	return (_size * (_comv_bufcount)) + 512;
 }
 
+static u32 num_lcu(u32 frame_width, u32 frame_height, u32 lcu_size)
+{
+	return ((frame_width + lcu_size - 1) / lcu_size) *
+		((frame_height + lcu_size - 1) / lcu_size);
+}
+
+static u32 hfi_buffer_comv_av1d(u32 frame_width, u32 frame_height, u32 comv_bufcount)
+{
+	u32 size;
+
+	size = 2 * ALIGN(max(num_lcu(frame_width, frame_height, 64) *
+			     AV1D_SIZE_BSE_COL_MV_64x64,
+			     num_lcu(frame_width, frame_height, 128) *
+			     AV1D_SIZE_BSE_COL_MV_128x128),
+			 DMA_ALIGNMENT);
+	size *= comv_bufcount;
+
+	return size;
+}
+
 static u32 size_h264d_bse_cmd_buf(u32 frame_height)
 {
 	u32 height = ALIGN(frame_height, 32);
@@ -122,7 +178,7 @@ static u32 size_h265d_bse_cmd_buf(u32 frame_width, u32 frame_height)
 {
 	u32 _size = ALIGN(((ALIGN(frame_width, LCU_MAX_SIZE_PELS) / LCU_MIN_SIZE_PELS) *
 			   (ALIGN(frame_height, LCU_MAX_SIZE_PELS) / LCU_MIN_SIZE_PELS)) *
-			   NUM_HW_PIC_BUF, DMA_ALIGNMENT);
+			  NUM_HW_PIC_BUF, DMA_ALIGNMENT);
 	_size = min_t(u32, _size, H265D_MAX_SLICE + 1);
 	_size = 2 * _size * SIZE_H265D_BSE_CMD_PER_BUF;
 
@@ -174,6 +230,20 @@ static u32 hfi_buffer_persist_h264d(void)
 		    DMA_ALIGNMENT);
 }
 
+static u32 hfi_buffer_persist_av1d(u32 max_width, u32 max_height, u32 total_ref_count)
+{
+	u32 comv_size, size;
+
+	comv_size = hfi_buffer_comv_av1d(max_width, max_height, total_ref_count);
+	size = ALIGN((SIZE_AV1D_SEQUENCE_HEADER * 2 + SIZE_AV1D_METADATA +
+		AV1D_NUM_HW_PIC_BUF * (SIZE_AV1D_TILE_OFFSET + SIZE_AV1D_QM) +
+		AV1D_NUM_FRAME_HEADERS * (SIZE_AV1D_FRAME_HEADER +
+		2 * SIZE_AV1D_PROB_TABLE) + comv_size + HDR10_HIST_EXTRADATA_SIZE +
+		SIZE_AV1D_METADATA * AV1D_NUM_HW_PIC_BUF), DMA_ALIGNMENT);
+
+	return ALIGN(size, DMA_ALIGNMENT);
+}
+
 static u32 hfi_buffer_non_comv_h264d(u32 frame_width, u32 frame_height, u32 num_vpp_pipes)
 {
 	u32 size_bse = size_h264d_bse_cmd_buf(frame_height);
@@ -459,6 +529,182 @@ static u32 hfi_buffer_line_h264d(u32 frame_width, u32 frame_height,
 	return ALIGN((size + vpss_lb_size), DMA_ALIGNMENT);
 }
 
+static u32 size_av1d_lb_opb_wr1_nv12_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size, y_width, y_width_a = 128;
+
+	y_width = ALIGN(frame_width, y_width_a);
+
+	size = ((y_width + HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH - 1) /
+		HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH +
+		(AV1D_MAX_TILE_COLS - 1));
+	return size * UBWC_TILE_SIZE;
+}
+
+static u32 size_av1d_lb_opb_wr1_tp10_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size, y_width, y_width_a = 256;
+
+	y_width = ALIGN(frame_width, y_width_a);
+
+	size = ((y_width + HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH - 1) /
+		HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH +
+		(AV1D_MAX_TILE_COLS - 1));
+
+	return size * UBWC_TILE_SIZE;
+}
+
+static u32 hfi_buffer_line_av1d(u32 frame_width, u32 frame_height,
+				bool is_opb, u32 num_vpp_pipes)
+{
+	u32 size, vpss_lb_size, opbwrbufsize, opbwr8, opbwr10;
+
+	size = ALIGN(size_av1d_lb_fe_top_data(frame_width, frame_height),
+		     DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_fe_top_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_fe_left_data(frame_width, frame_height),
+		      DMA_ALIGNMENT) * num_vpp_pipes +
+		ALIGN(size_av1d_lb_fe_left_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) * num_vpp_pipes +
+		ALIGN(size_av1d_lb_se_left_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) * num_vpp_pipes +
+		ALIGN(size_av1d_lb_se_top_ctrl(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_pe_top_data(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_vsp_top(frame_width, frame_height),
+		      DMA_ALIGNMENT) +
+		ALIGN(size_av1d_lb_recon_dma_metadata_wr
+		      (frame_width, frame_height), DMA_ALIGNMENT) * 2 +
+		ALIGN(size_av1d_qp(frame_width, frame_height), DMA_ALIGNMENT);
+	opbwr8 = size_av1d_lb_opb_wr1_nv12_ubwc(frame_width, frame_height);
+	opbwr10 = size_av1d_lb_opb_wr1_tp10_ubwc(frame_width, frame_height);
+	opbwrbufsize = opbwr8 >= opbwr10 ? opbwr8 : opbwr10;
+	size = ALIGN((size + opbwrbufsize), DMA_ALIGNMENT);
+	if (is_opb) {
+		vpss_lb_size = size_vpss_lb(frame_width, frame_height);
+		size = ALIGN((size + vpss_lb_size) * 2, DMA_ALIGNMENT);
+	}
+
+	return size;
+}
+
+static u32 size_av1d_ibc_nv12_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size;
+	u32 y_width_a = 128, y_height_a = 32;
+	u32 uv_width_a = 128, uv_height_a = 32;
+	u32 ybufsize, uvbufsize, y_width, y_height, uv_width, uv_height;
+	u32 y_meta_width_a = 64, y_meta_height_a = 16;
+	u32 uv_meta_width_a = 64, uv_meta_height_a = 16;
+	u32 meta_height, meta_stride, meta_size;
+	u32 tile_width_y = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_WIDTH;
+	u32 tile_height_y = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_Y_TILE_HEIGHT;
+	u32 tile_width_uv = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_WIDTH;
+	u32 tile_height_uv = HFI_COLOR_FORMAT_YUV420_NV12_UBWC_UV_TILE_HEIGHT;
+
+	y_width = ALIGN(frame_width, y_width_a);
+	y_height = ALIGN(frame_height, y_height_a);
+	uv_width = ALIGN(frame_width, uv_width_a);
+	uv_height = ALIGN(((frame_height + 1) >> 1), uv_height_a);
+	ybufsize = ALIGN((y_width * y_height), HFI_ALIGNMENT_4096);
+	uvbufsize = ALIGN(uv_width * uv_height, HFI_ALIGNMENT_4096);
+	size = ybufsize + uvbufsize;
+	meta_stride = ALIGN(((frame_width + (tile_width_y - 1)) / tile_width_y),
+			    y_meta_width_a);
+	meta_height = ALIGN(((frame_height + (tile_height_y - 1)) / tile_height_y),
+			    y_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+	meta_stride = ALIGN(((((frame_width + 1) >> 1) + (tile_width_uv - 1)) /
+			     tile_width_uv), uv_meta_width_a);
+	meta_height = ALIGN(((((frame_height + 1) >> 1) + (tile_height_uv - 1)) /
+			     tile_height_uv), uv_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+
+	return size;
+}
+
+static u32 hfi_yuv420_tp10_calc_y_stride(u32 frame_width, u32 stride_multiple)
+{
+	u32 stride;
+
+	stride = ALIGN(frame_width, 192);
+	stride = ALIGN(stride * 4 / 3, stride_multiple);
+
+	return stride;
+}
+
+static u32 hfi_yuv420_tp10_calc_y_bufheight(u32 frame_height, u32 min_buf_height_multiple)
+{
+	return ALIGN(frame_height, min_buf_height_multiple);
+}
+
+static u32 hfi_yuv420_tp10_calc_uv_stride(u32 frame_width, u32 stride_multiple)
+{
+	u32 stride;
+
+	stride = ALIGN(frame_width, 192);
+	stride = ALIGN(stride * 4 / 3, stride_multiple);
+
+	return stride;
+}
+
+static u32 hfi_yuv420_tp10_calc_uv_bufheight(u32 frame_height, u32 min_buf_height_multiple)
+{
+	return ALIGN(((frame_height + 1) >> 1), min_buf_height_multiple);
+}
+
+static u32 size_av1d_ibc_tp10_ubwc(u32 frame_width, u32 frame_height)
+{
+	u32 size;
+	u32 y_width_a = 256, y_height_a = 16,
+		uv_width_a = 256, uv_height_a = 16;
+	u32 ybufsize, uvbufsize, y_width, y_height, uv_width, uv_height;
+	u32 y_meta_width_a = 64, y_meta_height_a = 16,
+		uv_meta_width_a = 64, uv_meta_height_a = 16;
+	u32 meta_height, meta_stride, meta_size;
+	u32 tile_width_y = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_WIDTH;
+	u32 tile_height_y = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_Y_TILE_HEIGHT;
+	u32 tile_width_uv = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_WIDTH;
+	u32 tile_height_uv = HFI_COLOR_FORMAT_YUV420_TP10_UBWC_UV_TILE_HEIGHT;
+
+	y_width = hfi_yuv420_tp10_calc_y_stride(frame_width, y_width_a);
+	y_height = hfi_yuv420_tp10_calc_y_bufheight(frame_height, y_height_a);
+	uv_width = hfi_yuv420_tp10_calc_uv_stride(frame_width, uv_width_a);
+	uv_height = hfi_yuv420_tp10_calc_uv_bufheight(frame_height, uv_height_a);
+	ybufsize = ALIGN(y_width * y_height, HFI_ALIGNMENT_4096);
+	uvbufsize = ALIGN(uv_width * uv_height, HFI_ALIGNMENT_4096);
+	size = ybufsize + uvbufsize;
+	meta_stride = ALIGN(((frame_width + (tile_width_y - 1)) / tile_width_y),
+			    y_meta_width_a);
+	meta_height = ALIGN(((frame_height + (tile_height_y - 1)) / tile_height_y),
+			    y_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+	meta_stride = ALIGN(((((frame_width + 1) >> 1) + (tile_width_uv - 1)) /
+			     tile_width_uv), uv_meta_width_a);
+	meta_height = ALIGN(((((frame_height + 1) >> 1) + (tile_height_uv - 1)) /
+			     tile_height_uv), uv_meta_height_a);
+	meta_size = ALIGN(meta_stride * meta_height, HFI_ALIGNMENT_4096);
+	size += meta_size;
+
+	return size;
+}
+
+static u32 hfi_buffer_ibc_av1d(u32 frame_width, u32 frame_height)
+{
+	u32 size, ibc8, ibc10;
+
+	ibc8 = size_av1d_ibc_nv12_ubwc(frame_width, frame_height);
+	ibc10 = size_av1d_ibc_tp10_ubwc(frame_width, frame_height);
+	size = ibc8 >= ibc10 ? ibc8 : ibc10;
+
+	return ALIGN(size, DMA_ALIGNMENT);
+}
+
 static u32 iris_vpu_dec_bin_size(struct iris_inst *inst)
 {
 	u32 num_vpp_pipes = inst->core->iris_platform_data->num_vpp_pipe;
@@ -472,6 +718,8 @@ static u32 iris_vpu_dec_bin_size(struct iris_inst *inst)
 		return hfi_buffer_bin_h265d(width, height, num_vpp_pipes);
 	else if (inst->codec == V4L2_PIX_FMT_VP9)
 		return hfi_buffer_bin_vp9d(width, height, num_vpp_pipes);
+	else if (inst->codec == V4L2_PIX_FMT_AV1)
+		return hfi_buffer_bin_av1d(width, height, num_vpp_pipes);
 
 	return 0;
 }
@@ -487,18 +735,34 @@ static u32 iris_vpu_dec_comv_size(struct iris_inst *inst)
 		return hfi_buffer_comv_h264d(width, height, num_comv);
 	else if (inst->codec == V4L2_PIX_FMT_HEVC)
 		return hfi_buffer_comv_h265d(width, height, num_comv);
+	else if (inst->codec == V4L2_PIX_FMT_AV1) {
+		if (inst->fw_caps[DRAP].value)
+			return 0;
+		else
+			return hfi_buffer_comv_av1d(width, height, num_comv);
+	}
 
 	return 0;
 }
 
 static u32 iris_vpu_dec_persist_size(struct iris_inst *inst)
 {
+	struct platform_inst_caps *caps;
+
 	if (inst->codec == V4L2_PIX_FMT_H264)
 		return hfi_buffer_persist_h264d();
 	else if (inst->codec == V4L2_PIX_FMT_HEVC)
 		return hfi_buffer_persist_h265d(0);
 	else if (inst->codec == V4L2_PIX_FMT_VP9)
 		return hfi_buffer_persist_vp9d();
+	else if (inst->codec == V4L2_PIX_FMT_AV1) {
+		caps = inst->core->iris_platform_data->inst_caps;
+		if (inst->fw_caps[DRAP].value)
+			return hfi_buffer_persist_av1d(caps->max_frame_width,
+						       caps->max_frame_height, 16);
+		else
+			return hfi_buffer_persist_av1d(0, 0, 0);
+	}
 
 	return 0;
 }
@@ -545,6 +809,8 @@ static u32 iris_vpu_dec_line_size(struct iris_inst *inst)
 	else if (inst->codec == V4L2_PIX_FMT_VP9)
 		return hfi_buffer_line_vp9d(width, height, out_min_count, is_opb,
 					    num_vpp_pipes);
+	else if (inst->codec == V4L2_PIX_FMT_AV1)
+		return hfi_buffer_line_av1d(width, height, is_opb, num_vpp_pipes);
 
 	return 0;
 }
@@ -653,6 +919,15 @@ static u32 iris_vpu_enc_bin_size(struct iris_inst *inst)
 				  num_vpp_pipes, inst->hfi_rc_type);
 }
 
+static u32 iris_vpu_dec_partial_size(struct iris_inst *inst)
+{
+	struct v4l2_format *f = inst->fmt_src;
+	u32 height = f->fmt.pix_mp.height;
+	u32 width = f->fmt.pix_mp.width;
+
+	return hfi_buffer_ibc_av1d(width, height);
+}
+
 static inline
 u32 hfi_buffer_comv_enc(u32 frame_width, u32 frame_height, u32 lcu_size,
 			u32 num_recon, u32 standard)
@@ -1414,7 +1689,9 @@ static int output_min_count(struct iris_inst *inst)
 
 	/* fw_min_count > 0 indicates reconfig event has already arrived */
 	if (inst->fw_min_count) {
-		if (iris_split_mode_enabled(inst) && inst->codec == V4L2_PIX_FMT_VP9)
+		if (iris_split_mode_enabled(inst) &&
+		    (inst->codec == V4L2_PIX_FMT_VP9 ||
+		     inst->codec == V4L2_PIX_FMT_AV1))
 			return min_t(u32, 4, inst->fw_min_count);
 		else
 			return inst->fw_min_count;
@@ -1422,6 +1699,8 @@ static int output_min_count(struct iris_inst *inst)
 
 	if (inst->codec == V4L2_PIX_FMT_VP9)
 		output_min_count = 9;
+	else if (inst->codec == V4L2_PIX_FMT_AV1)
+		output_min_count = 11;
 
 	return output_min_count;
 }
@@ -1444,6 +1723,7 @@ u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type)
 		{BUF_PERSIST, iris_vpu_dec_persist_size },
 		{BUF_DPB, iris_vpu_dec_dpb_size },
 		{BUF_SCRATCH_1, iris_vpu_dec_scratch1_size },
+		{BUF_PARTIAL, iris_vpu_dec_partial_size },
 	};
 
 	static const struct iris_vpu_buf_type_handle enc_internal_buf_type_handle[] = {
@@ -1510,14 +1790,20 @@ static u32 internal_buffer_count(struct iris_inst *inst,
 	    buffer_type == BUF_PERSIST) {
 		return 1;
 	} else if (buffer_type == BUF_COMV || buffer_type == BUF_NON_COMV) {
-		if (inst->codec == V4L2_PIX_FMT_H264 || inst->codec == V4L2_PIX_FMT_HEVC)
+		if (inst->codec == V4L2_PIX_FMT_H264 ||
+		    inst->codec == V4L2_PIX_FMT_HEVC ||
+		    inst->codec == V4L2_PIX_FMT_AV1)
 			return 1;
 	}
+
 	return 0;
 }
 
 static inline int iris_vpu_dpb_count(struct iris_inst *inst)
 {
+	if (inst->codec == V4L2_PIX_FMT_AV1)
+		return 11;
+
 	if (iris_split_mode_enabled(inst)) {
 		return inst->fw_min_count ?
 			inst->fw_min_count : inst->buffers[BUF_OUTPUT].min_count;
@@ -1536,9 +1822,13 @@ int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffer_type
 			return MIN_BUFFERS;
 		else
 			return output_min_count(inst);
+	case BUF_NON_COMV:
+		if (inst->codec == V4L2_PIX_FMT_AV1)
+			return 0;
+		else
+			return 1;
 	case BUF_BIN:
 	case BUF_COMV:
-	case BUF_NON_COMV:
 	case BUF_LINE:
 	case BUF_PERSIST:
 		return internal_buffer_count(inst, buffer_type);
@@ -1546,6 +1836,7 @@ int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffer_type
 	case BUF_SCRATCH_2:
 	case BUF_VPSS:
 	case BUF_ARP:
+	case BUF_PARTIAL:
 		return 1; /* internal buffer count needed by firmware is 1 */
 	case BUF_DPB:
 		return iris_vpu_dpb_count(inst);
diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
index 04f0b7400a1e4e1d274d690a2761b9e57778e8b7..13c7199fcf351cac5f01c0d1ca20c093b1e62234 100644
--- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
+++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h
@@ -11,6 +11,7 @@ struct iris_inst;
 #define MIN_BUFFERS 4
 
 #define DMA_ALIGNMENT 256
+#define HFI_ALIGNMENT_4096 4096
 
 #define NUM_HW_PIC_BUF 32
 #define LCU_MAX_SIZE_PELS 64
@@ -81,6 +82,22 @@ struct iris_inst;
 #define MAX_PE_NBR_DATA_LCU64_LINE_BUFFER_SIZE 384
 #define MAX_FE_NBR_DATA_LUMA_LINE_BUFFER_SIZE 640
 
+#define AV1_CABAC_HDR_RATIO_HD_TOT 2
+#define AV1_CABAC_RES_RATIO_HD_TOT 2
+#define AV1D_LCU_MAX_SIZE_PELS 128
+#define AV1D_LCU_MIN_SIZE_PELS 64
+#define AV1D_MAX_TILE_COLS 64
+#define MAX_PE_NBR_DATA_LCU32_LINE_BUFFER_SIZE 192
+#define MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE 96
+#define AV1D_NUM_HW_PIC_BUF 16
+#define AV1D_NUM_FRAME_HEADERS 16
+#define SIZE_AV1D_SEQUENCE_HEADER 768
+#define SIZE_AV1D_METADATA 512
+#define SIZE_AV1D_FRAME_HEADER 1280
+#define SIZE_AV1D_TILE_OFFSET 65536
+#define SIZE_AV1D_QM 3328
+#define SIZE_AV1D_PROB_TABLE 22784
+
 #define SIZE_SLICE_CMD_BUFFER (ALIGN(20480, 256))
 #define SIZE_SPS_PPS_SLICE_HDR (2048 + 4096)
 #define SIZE_BSE_SLICE_CMD_BUF ((((8192 << 2) + 7) & (~7)) * 3)
@@ -101,6 +118,15 @@ struct iris_inst;
 #define NUM_MBS_4K (DIV_ROUND_UP(MAX_WIDTH, 16) * DIV_ROUND_UP(MAX_HEIGHT, 16))
 #define NUM_MBS_720P (((ALIGN(1280, 16)) >> 4) * ((ALIGN(736, 16)) >> 4))
 
+#define BITS_PER_PIX 16
+#define NUM_LINES_LUMA 10
+#define NUM_LINES_CHROMA 6
+#define AV1D_LCU_MAX_SIZE_PELS 128
+#define AV1D_LCU_MIN_SIZE_PELS 64
+#define AV1D_MAX_TILE_COLS 64
+#define BITS_PER_CTRL_PACK 128
+#define NUM_CTRL_PACK_LCU 10
+
 static inline u32 size_h264d_lb_fe_top_data(u32 frame_width)
 {
 	return MAX_FE_NBR_DATA_LUMA_LINE_BUFFER_SIZE * ALIGN(frame_width, 16) * 3;
@@ -146,6 +172,96 @@ static inline u32 size_h264d_qp(u32 frame_width, u32 frame_height)
 	return DIV_ROUND_UP(frame_width, 64) * DIV_ROUND_UP(frame_height, 64) * 128;
 }
 
+static inline u32 size_av1d_lb_fe_top_data(u32 frame_width, u32 frame_height)
+{
+	return (ALIGN(frame_width, AV1D_LCU_MAX_SIZE_PELS) *
+		((BITS_PER_PIX * NUM_LINES_LUMA) >> 3) +
+		ALIGN(frame_width, AV1D_LCU_MAX_SIZE_PELS) / 2 *
+		((BITS_PER_PIX * NUM_LINES_CHROMA) >> 3) * 2);
+}
+
+static inline u32 size_av1d_lb_fe_left_data(u32 frame_width, u32 frame_height)
+{
+	return (32 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 16) +
+		16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 2 +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 8) * 2 +
+		24 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 16) +
+		24 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 2 +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 12) * 2 +
+		24 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 16) +
+		16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 16) +
+		16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 2 +
+		ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS * 12) * 2);
+}
+
+static inline u32 size_av1d_lb_fe_top_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (NUM_CTRL_PACK_LCU * ((frame_width + AV1D_LCU_MIN_SIZE_PELS - 1) /
+		AV1D_LCU_MIN_SIZE_PELS) * BITS_PER_CTRL_PACK / 8);
+}
+
+static inline u32 size_av1d_lb_fe_left_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (16 * ((ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) / 16) +
+		(ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS)) +
+		3 * 16 * (ALIGN(frame_height, AV1D_LCU_MAX_SIZE_PELS) /
+		AV1D_LCU_MIN_SIZE_PELS));
+}
+
+static inline u32 size_av1d_lb_se_top_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (((frame_width + 7) / 8) * MAX_SE_NBR_CTRL_LCU16_LINE_BUFFER_SIZE);
+}
+
+static inline u32 size_av1d_lb_se_left_ctrl(u32 frame_width, u32 frame_height)
+{
+	return (max(((frame_height + 15) / 16) *
+		MAX_SE_NBR_CTRL_LCU16_LINE_BUFFER_SIZE,
+		max(((frame_height + 31) / 32) *
+		MAX_SE_NBR_CTRL_LCU32_LINE_BUFFER_SIZE,
+		((frame_height + 63) / 64) *
+		MAX_SE_NBR_CTRL_LCU64_LINE_BUFFER_SIZE)));
+}
+
+static inline u32 size_av1d_lb_pe_top_data(u32 frame_width, u32 frame_height)
+{
+	return (max(((frame_width + 15) / 16) *
+		MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE,
+		max(((frame_width + 31) / 32) *
+		MAX_PE_NBR_DATA_LCU32_LINE_BUFFER_SIZE,
+		((frame_width + 63) / 64) *
+		MAX_PE_NBR_DATA_LCU64_LINE_BUFFER_SIZE)));
+}
+
+static inline u32 size_av1d_lb_vsp_top(u32 frame_width, u32 frame_height)
+{
+	return (max(((frame_width + 63) / 64) * 1280,
+		((frame_width + 127) / 128) * MAX_HEIGHT));
+}
+
+static inline u32 size_av1d_lb_recon_dma_metadata_wr(u32 frame_width,
+						     u32 frame_height)
+{
+	return ((ALIGN(frame_height, 8) / (4 / 2)) * 64);
+}
+
+static inline u32 size_av1d_qp(u32 frame_width, u32 frame_height)
+{
+	return size_h264d_qp(frame_width, frame_height);
+}
+
 u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type);
 u32 iris_vpu33_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer_type);
 int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffer_type);

-- 
2.34.1
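
Illustrative note, not part of the patch: the COMV sizing in
hfi_buffer_comv_av1d() can be checked outside the kernel with a small
userspace sketch. It reuses the constants and the num_lcu() formula from
the diff; align_up(), max_u32() and comv_size_av1d() are hypothetical
stand-ins for the kernel's ALIGN(), max() and the driver function, and
align_up() assumes a power-of-two alignment.

```c
#include <stdint.h>

/* Constants copied from the patch. */
#define DMA_ALIGNMENT 256u
#define AV1D_SIZE_BSE_COL_MV_64x64 512u
#define AV1D_SIZE_BSE_COL_MV_128x128 2816u

/* Hypothetical stand-in for the kernel ALIGN(); a must be a power of two. */
static uint32_t align_up(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical stand-in for the kernel max(). */
static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}

/* Number of LCUs covering the frame, rounding each dimension up. */
static uint32_t num_lcu(uint32_t frame_width, uint32_t frame_height,
			uint32_t lcu_size)
{
	return ((frame_width + lcu_size - 1) / lcu_size) *
		((frame_height + lcu_size - 1) / lcu_size);
}

/*
 * Same arithmetic as hfi_buffer_comv_av1d() in the diff: take the larger
 * of the 64x64 and 128x128 LCU layouts, align it to DMA_ALIGNMENT,
 * double it, then scale by the number of COMV buffers.
 */
static uint32_t comv_size_av1d(uint32_t frame_width, uint32_t frame_height,
			       uint32_t comv_bufcount)
{
	uint32_t size;

	size = 2 * align_up(max_u32(num_lcu(frame_width, frame_height, 64) *
				    AV1D_SIZE_BSE_COL_MV_64x64,
				    num_lcu(frame_width, frame_height, 128) *
				    AV1D_SIZE_BSE_COL_MV_128x128),
			    DMA_ALIGNMENT);

	return size * comv_bufcount;
}
```

For a 1920x1080 frame the 64x64 layout gives 30 * 17 = 510 LCUs
(261120 bytes) and the 128x128 layout gives 15 * 9 = 135 LCUs
(380160 bytes), so the 128x128 case dominates and
comv_size_av1d(1920, 1080, 1) evaluates to 2 * 380160 = 760320 bytes.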