From nobody Tue Feb 10 08:26:43 2026 Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4B9DB369989 for ; Fri, 17 Oct 2025 14:17:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.168.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760710651; cv=none; b=RaPOOKD9MstRIp03GVMeQTT/YoHHwjGyGZIwwIR7CCn9Sw9H9EW3oTHvg4sK4nlfRlODLZVBWhiNPMelWIc9Hish3J1cZDchVRYwTY43mYvZzJSy7iox8/9ol/2gq/iQhLuASf6QJxWazY554b8IZspaVYUQ+K13I5d2EfXHVDI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760710651; c=relaxed/simple; bh=QUCzrQ27jWi4zu3mz4a5d4iizr3a4+2aOvcFaEdgCmI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=MPmkHqWfNvBJP3+3btPJUop4YxlidxbAukgxgZSKOG/Hos4uaJSiK31SfG4c5TQeFFcc1hS/F+9y4Zd6gWVp3JYa1GNNqOQSRxgtraFgzxo5FK6kkJCRCLrvAttWVZoIdBW+AqRxd33xwzX3HDFCFdBHdYn69O+LA52Tynisiec= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oss.qualcomm.com; spf=pass smtp.mailfrom=oss.qualcomm.com; dkim=pass (2048-bit key) header.d=qualcomm.com header.i=@qualcomm.com header.b=Bfga0AP1; arc=none smtp.client-ip=205.220.168.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oss.qualcomm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oss.qualcomm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=qualcomm.com header.i=@qualcomm.com header.b="Bfga0AP1" Received: from pps.filterd (m0279866.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 59H8EA0o006211 for ; Fri, 17 Oct 2025 14:17:28 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to; s=qcppdkim1; bh= p8SSPCNuUvsxVPD7C7eu+wivyoPxKDv1UCAhhnoPljM=; b=Bfga0AP1rudXbDsP EzfeXCiX3DqWqOFoQOSZ8Dx8mLHb+Ji5Q1lJHE7euKo0C74pZl/mRyDqGAwpZdRo 6UtKgLdOzKxtpLhaSRhoPNjhd4kM2ITqtJTVechhx0xpopb0eeBAhrsf6d/pOp6X eqYpqIkdIfv84wUhErOY0WGMcMB9tT3tB+phr4NkdIJJ5ePGKC9G+t/lzs1hHWji 5wWRqBmrIqzcMdAqE2kbIHSq2voeQyLkZwVUKt52KDO4r8ZsdZ7Ap3b5EuxfQcOZ txr874TKfb2tWgjpH33INgixXgnbXzOU2q5r4IrT5zmEGhP7LNsU89+Msg0ZXbI/ riPuUQ== Received: from mail-pf1-f200.google.com (mail-pf1-f200.google.com [209.85.210.200]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 49s6mwxtan-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 17 Oct 2025 14:17:28 +0000 (GMT) Received: by mail-pf1-f200.google.com with SMTP id d2e1a72fcca58-782063922ceso2015727b3a.0 for ; Fri, 17 Oct 2025 07:17:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1760710647; x=1761315447; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=p8SSPCNuUvsxVPD7C7eu+wivyoPxKDv1UCAhhnoPljM=; b=Aui0ytStKzEynYdWNkMne6TRVOdmQbX3nv9e2zmMGm+GPH/BnyhJP1Qvhgk5jZy1tR 7dNy8x2sECPEIPsPoJeIDFQAyYhNA4t3+R1Y8RcIqQtLE8om21gIw7En18A70T3P2/jk 8pH2XwpzWzdj/PI6dzlD2uMcp1pNntkidsUMAPVr9kaNBjhznvsISzs9etP04uFrRZ16 sphCW5F/7Kw5CkQAjm6jgy54M4SvxX5WwUUKe9o+bGGoxd7bJnIhC0i0eKGZ5ZdazL46 SdVgl1C9SYqeojcolRM6sLRCeaX6UkSCIxUNiM9p9tP95XA5801VJLD8K323grfnFs2Y gwQA== X-Forwarded-Encrypted: i=1; AJvYcCXm/y4ePS2bauuQ3GXZwh4ZJmVHFkY6YlwYhkWIYxM9TXAK+mLEboNWp+urSEcQZ8FKt3tdoZi/kXldDL4=@vger.kernel.org X-Gm-Message-State: AOJu0Yy4EjFPE0IowfyA7Awxzjrn4Fj4HjZuq+D/9qm550RhsJUHqoPx Y2tqfhJGWA6vncYkdOVB892QE61FuAQOLIQv9oAAnmT6G2B6MeMI92KsL9HANxJZukDJUq1oXrt nps4d2r32kNp1lwhLi0cbfwf73IwrHHyYrarairBL/LWoruCiuOnRq41CIs6WVlOM1GE= X-Gm-Gg: ASbGncsh66x6ihFacHZFdgFxytkIBfmjF6QZMMTgXSyX3ymM6upeg4Lp/ag1MXnzZSv oOlWzzZWb3UNqKRx0cPel0eUzy5GA2qmueLWXoZnQAO2ka67K8XwKg1SPQ4d0IHhBbrP7vd16XY OXRT4DoxM9cLd3kjU17btoODfGeU43REyz12dc5uBRCE7bkaldcH70PAchLWrKk/6gvUSqAcNOb r4iHuLxIBZZ/F9Hf2BTChnXv/YsxQJyMpINT4m6YMbF21UdghRXD4o/d+FTKvyF42L6Vfv56px3 s8Z0kVPi2vPB+B9glEA+fKTNGDU+TTiqIAD2eVoRXSnvUzlqBRnltDZq0+QHXSQ9TffDBqk3Rf9 07+WvUEwtt4TgPsf8vOkz9xqX2KdCBvrQEg== X-Received: by 2002:a05:6a00:1911:b0:78c:97fa:6170 with SMTP id d2e1a72fcca58-7a220b10732mr5221903b3a.17.1760710647059; Fri, 17 Oct 2025 07:17:27 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGvyCAohehHJDVOVN0iZeGAsZf0UHakRxKtJsk8yoISo7GKWPbIiZed8kVj6CE7hAJIXFAXcA== X-Received: by 2002:a05:6a00:1911:b0:78c:97fa:6170 with SMTP id d2e1a72fcca58-7a220b10732mr5221824b3a.17.1760710646387; Fri, 17 Oct 2025 07:17:26 -0700 (PDT) Received: from hu-vgarodia-hyd.qualcomm.com ([202.46.23.25]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-7992d0966d7sm25895826b3a.40.2025.10.17.07.17.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 17 Oct 2025 07:17:25 -0700 (PDT) From: Vikash Garodia Date: Fri, 17 Oct 2025 19:46:25 +0530 Subject: [PATCH v2 4/8] media: iris: Introduce buffer size calculations for vpu4 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251017-knp_video-v2-4-f568ce1a4be3@oss.qualcomm.com> References: <20251017-knp_video-v2-0-f568ce1a4be3@oss.qualcomm.com> In-Reply-To: <20251017-knp_video-v2-0-f568ce1a4be3@oss.qualcomm.com> To: Dikshita Agarwal , Abhinav Kumar , Bryan O'Donoghue , Mauro Carvalho Chehab , Rob Herring , Krzysztof Kozlowski , Conor Dooley , Philipp Zabel , Dmitry Baryshkov , Konrad Dybcio Cc: linux-arm-msm@vger.kernel.org, linux-media@vger.kernel.org, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, Vishnu Reddy , Vikash Garodia X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1760710621; l=19297; i=vikash.garodia@oss.qualcomm.com; s=20241104; h=from:subject:message-id; bh=QUCzrQ27jWi4zu3mz4a5d4iizr3a4+2aOvcFaEdgCmI=; b=eTkbvfThChqSxdWxxDjVnYxnzgbHc96WXIv9/oEjvlqiS1V3uABW3ZpD8F5YrS7wcyCsQ6jwj yd4CDYOimyCBpYmUHxNzfbFjm7YuBwc8Zj+gc8XjbIUoqEr8XNYzuZ5 X-Developer-Key: i=vikash.garodia@oss.qualcomm.com; a=ed25519; pk=LY9Eqp4KiHWxzGNKGHbwRFEJOfRCSzG/rxQNmvZvaKE= X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUxMDEzMDA4MyBTYWx0ZWRfXz5ozWKQNb4MF 3rtgNdMDcZpUHBIBiaS2wEOJ8gyzCnvQu8zVd6Zp4F01YWIjoBbVzUApdv+kjcqbhBLZ99c2tGB Be2KQ6ujxPyjppDE28xeHZUwAypSesRkm1Kn6BU3xz0oobFsT4fVw8t9Wx0i0JJ9/T+QrY/HdF+ TNyuOhD5I4VLzPTp/yw6OYjn/tAs+Xi/OtNdkfJPiPoxy1Hyv9nQ4gnz9i2v7nGmpM6wBJQb2q0 Pw2zAd8Pcx4pkGX9C3AzGA7gt4vHQW02pE3li93s7f4smbRoBRy+7BB5D5183KSYEtRRLdxmLkz 16t0xl6L7dbwoZOe7a08BH5SU2Pxh2fl/ZQiBQjEEXX39ErndycAjQQr8CEX18pQ4V9cc9NLxW2 PZ/9JnMYXZZ728RlO28zpt6obl37Fg== X-Authority-Analysis: v=2.4 cv=Fr4IPmrq c=1 sm=1 tr=0 ts=68f24ff8 cx=c_pps a=mDZGXZTwRPZaeRUbqKGCBw==:117 a=ZePRamnt/+rB5gQjfz0u9A==:17 a=IkcTkHD0fZMA:10 a=x6icFKpwvdMA:10 a=VkNPw1HP01LnGYTKEx00:22 a=COk6AnOGAAAA:8 a=EUspDBNiAAAA:8 a=BpuH9DVPwUbd3VYPi08A:9 a=QEXdDO2ut3YA:10 a=zc0IvFSfCIW2DFIPzwfm:22 a=TjNXssC_j7lpFel5tvFf:22 X-Proofpoint-GUID: TWcGH7F_n3npRRZQz48l3N5YWN4v8ESC X-Proofpoint-ORIG-GUID: TWcGH7F_n3npRRZQz48l3N5YWN4v8ESC X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1121,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-10-17_05,2025-10-13_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 impostorscore=0 spamscore=0 phishscore=0 malwarescore=0 adultscore=0 lowpriorityscore=0 bulkscore=0 suspectscore=0 clxscore=1015 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.19.0-2510020000 definitions=main-2510130083 Introduces vp4 buffer size calculation for both encoder and decoder. Reuse the buffer size calculation which are common, while adding the vpu4 ones separately. Co-developed-by: Vishnu Reddy Signed-off-by: Vishnu Reddy Signed-off-by: Vikash Garodia --- drivers/media/platform/qcom/iris/iris_vpu_buffer.c | 345 +++++++++++++++++= ++++ drivers/media/platform/qcom/iris/iris_vpu_buffer.h | 15 + 2 files changed, 360 insertions(+) diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c b/drivers/m= edia/platform/qcom/iris/iris_vpu_buffer.c index 4463be05ce165adef6b152eb0c155d2e6a7b3c36..8cc52d7aba3ffb968191519c1a1= a10e326403205 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.c +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.c @@ -1408,6 +1408,307 @@ static u32 iris_vpu_enc_vpss_size(struct iris_inst = *inst) return hfi_buffer_vpss_enc(width, height, ds_enable, 0, 0); } =20 +static inline u32 size_dpb_opb(u32 height, u32 lcu_size) +{ + u32 max_tile_height =3D ((height + lcu_size - 1) / lcu_size) * lcu_size += 8; + u32 dpb_opb =3D 3 * ((max_tile_height >> 3) * DMA_ALIGNMENT); + u32 num_luma_chrome_plane =3D 2; + + return dpb_opb =3D num_luma_chrome_plane * ALIGN(dpb_opb, DMA_ALIGNMENT); +} + +static u32 hfi_vpu4x_vp9d_lb_size(u32 frame_width, u32 frame_height, u32 n= um_vpp_pipes) +{ + u32 vp9_top_lb, vp9_fe_left_lb, vp9_se_left_lb, dpb_opb, vp9d_qp, num_lcu= _per_pipe; + u32 lcu_size =3D 64, fe_top_ctrl_line_numbers =3D 3, fe_top_data_luma_lin= e_numbers =3D 2, + fe_top_data_chroma_line_numbers =3D 3, fe_lft_ctrl_line_numbers =3D 4, + fe_lft_db_data_line_numbers =3D 2, fe_lft_lr_data_line_numbers =3D 4; + + vp9_top_lb =3D ALIGN(size_vp9d_lb_vsp_top(frame_width, frame_height), DMA= _ALIGNMENT); + vp9_top_lb +=3D ALIGN(size_vpxd_lb_se_top_ctrl(frame_width, frame_height)= , DMA_ALIGNMENT); + vp9_top_lb +=3D max3(DIV_ROUND_UP(frame_width, BUFFER_ALIGNMENT_16_BYTES)= * + MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE, + DIV_ROUND_UP(frame_width, BUFFER_ALIGNMENT_32_BYTES) * + MAX_PE_NBR_DATA_LCU32_LINE_BUFFER_SIZE, + DIV_ROUND_UP(frame_width, BUFFER_ALIGNMENT_64_BYTES) * + MAX_PE_NBR_DATA_LCU64_LINE_BUFFER_SIZE); + vp9_top_lb =3D ALIGN(vp9_top_lb, DMA_ALIGNMENT); + vp9_top_lb +=3D ALIGN((DMA_ALIGNMENT * DIV_ROUND_UP(frame_width, lcu_size= )), + DMA_ALIGNMENT) * fe_top_ctrl_line_numbers; + vp9_top_lb +=3D ALIGN(DMA_ALIGNMENT * 8 * DIV_ROUND_UP(frame_width, lcu_s= ize), + DMA_ALIGNMENT) * (fe_top_data_luma_line_numbers + + fe_top_data_chroma_line_numbers); + + num_lcu_per_pipe =3D (DIV_ROUND_UP(frame_height, lcu_size) / num_vpp_pipe= s) + + (DIV_ROUND_UP(frame_height, lcu_size) % num_vpp_pipes); + vp9_fe_left_lb =3D ALIGN((DMA_ALIGNMENT * num_lcu_per_pipe), DMA_ALIGNMEN= T) * + fe_lft_ctrl_line_numbers; + vp9_fe_left_lb +=3D ((ALIGN((DMA_ALIGNMENT * 8 * num_lcu_per_pipe), DMA_A= LIGNMENT) * + fe_lft_db_data_line_numbers) + + ALIGN((DMA_ALIGNMENT * 3 * num_lcu_per_pipe), DMA_ALIGNMENT) + + ALIGN((DMA_ALIGNMENT * 4 * num_lcu_per_pipe), DMA_ALIGNMENT) + + (ALIGN((DMA_ALIGNMENT * 24 * num_lcu_per_pipe), DMA_ALIGNMENT) * + fe_lft_lr_data_line_numbers)); + vp9_fe_left_lb =3D vp9_fe_left_lb * num_vpp_pipes; + + vp9_se_left_lb =3D ALIGN(size_vpxd_lb_se_left_ctrl(frame_width, frame_hei= ght), + DMA_ALIGNMENT); + dpb_opb =3D size_dpb_opb(frame_height, lcu_size); + vp9d_qp =3D ALIGN(size_vp9d_qp(frame_width, frame_height), DMA_ALIGNMENT); + + return vp9_top_lb + vp9_fe_left_lb + (vp9_se_left_lb * num_vpp_pipes) + + (dpb_opb * num_vpp_pipes) + vp9d_qp; +} + +static u32 hfi_vpu4x_buffer_line_vp9d(u32 frame_width, u32 frame_height, u= 32 _yuv_bufcount_min, + bool is_opb, u32 num_vpp_pipes) +{ + u32 lb_size =3D hfi_vpu4x_vp9d_lb_size(frame_width, frame_height, num_vpp= _pipes); + u32 dpb_obp_size =3D 0, lcu_size =3D 64; + + if (is_opb) + dpb_obp_size =3D size_dpb_opb(frame_height, lcu_size) * num_vpp_pipes; + + return lb_size + dpb_obp_size; +} + +static u32 iris_vpu4x_dec_line_size(struct iris_inst *inst) +{ + u32 num_vpp_pipes =3D inst->core->iris_platform_data->num_vpp_pipe; + u32 out_min_count =3D inst->buffers[BUF_OUTPUT].min_count; + struct v4l2_format *f =3D inst->fmt_src; + u32 height =3D f->fmt.pix_mp.height; + u32 width =3D f->fmt.pix_mp.width; + bool is_opb =3D false; + + if (iris_split_mode_enabled(inst)) + is_opb =3D true; + + if (inst->codec =3D=3D V4L2_PIX_FMT_H264) + return hfi_buffer_line_h264d(width, height, is_opb, num_vpp_pipes); + else if (inst->codec =3D=3D V4L2_PIX_FMT_HEVC) + return hfi_buffer_line_h265d(width, height, is_opb, num_vpp_pipes); + else if (inst->codec =3D=3D V4L2_PIX_FMT_VP9) + return hfi_vpu4x_buffer_line_vp9d(width, height, out_min_count, is_opb, + num_vpp_pipes); + + return 0; +} + +static u32 hfi_vpu4x_buffer_persist_h265d(u32 rpu_enabled) +{ + return ALIGN((SIZE_SLIST_BUF_H265 * NUM_SLIST_BUF_H265 + H265_NUM_FRM_INF= O * + H265_DISPLAY_BUF_SIZE + (H265_NUM_TILE * sizeof(u32)) + (NUM_HW_PIC_BUF * + (SIZE_SEI_USERDATA + SIZE_H265D_ARP + SIZE_THREE_DIMENSION_USERDATA)) + + rpu_enabled * NUM_HW_PIC_BUF * SIZE_DOLBY_RPU_METADATA), DMA_ALIGNMENT); +} + +static u32 hfi_vpu4x_buffer_persist_vp9d(void) +{ + return ALIGN(VP9_NUM_PROBABILITY_TABLE_BUF * VP9_PROB_TABLE_SIZE, DMA_ALI= GNMENT) + + (ALIGN(hfi_iris3_vp9d_comv_size(), DMA_ALIGNMENT) * 2) + + ALIGN(MAX_SUPERFRAME_HEADER_LEN, DMA_ALIGNMENT) + + ALIGN(VP9_UDC_HEADER_BUF_SIZE, DMA_ALIGNMENT) + + ALIGN(VP9_NUM_FRAME_INFO_BUF * CCE_TILE_OFFSET_SIZE, DMA_ALIGNMENT) + + ALIGN(VP9_NUM_FRAME_INFO_BUF * VP9_FRAME_INFO_BUF_SIZE_VPU4X, DMA_ALIGNM= ENT) + + HDR10_HIST_EXTRADATA_SIZE; +} + +static u32 iris_vpu4x_dec_persist_size(struct iris_inst *inst) +{ + if (inst->codec =3D=3D V4L2_PIX_FMT_H264) + return hfi_buffer_persist_h264d(); + else if (inst->codec =3D=3D V4L2_PIX_FMT_HEVC) + return hfi_vpu4x_buffer_persist_h265d(0); + else if (inst->codec =3D=3D V4L2_PIX_FMT_VP9) + return hfi_vpu4x_buffer_persist_vp9d(); + + return 0; +} + +static u32 size_se_lb(u32 standard, u32 num_vpp_pipes_enc, + u32 frame_width_coded, u32 frame_height_coded) +{ + u32 se_tlb_size =3D ALIGN(frame_width_coded, DMA_ALIGNMENT); + u32 se_llb_size =3D (standard =3D=3D HFI_CODEC_ENCODE_HEVC) ? + ((frame_height_coded + BUFFER_ALIGNMENT_32_BYTES - 1) / + BUFFER_ALIGNMENT_32_BYTES) * LOG2_16 * LLB_UNIT_SIZE : + ((frame_height_coded + BUFFER_ALIGNMENT_16_BYTES - 1) / + BUFFER_ALIGNMENT_16_BYTES) * LOG2_32 * LLB_UNIT_SIZE; + + se_llb_size =3D ALIGN(se_llb_size, BUFFER_ALIGNMENT_32_BYTES); + + if (num_vpp_pipes_enc > 1) + se_llb_size =3D ALIGN(se_llb_size + BUFFER_ALIGNMENT_512_BYTES, + DMA_ALIGNMENT) * num_vpp_pipes_enc; + + return ALIGN(se_tlb_size + se_llb_size, DMA_ALIGNMENT); +} + +static u32 size_te_lb(bool is_ten_bit, u32 num_vpp_pipes_enc, u32 width_in= _lcus, + u32 frame_height_coded, u32 frame_width_coded) +{ + u32 num_pixel_10_bit =3D 3, num_pixel_8_bit =3D 2, num_pixel_te_llb =3D 3; + u32 te_llb_col_rc_size =3D ALIGN(32 * width_in_lcus / num_vpp_pipes_enc, + DMA_ALIGNMENT) * num_vpp_pipes_enc; + u32 te_tlb_recon_data_size =3D ALIGN((is_ten_bit ? num_pixel_10_bit : num= _pixel_8_bit) * + frame_width_coded, DMA_ALIGNMENT); + u32 te_llb_recon_data_size =3D ((1 + is_ten_bit) * num_pixel_te_llb * fra= me_height_coded + + num_vpp_pipes_enc - 1) / num_vpp_pipes_enc; + te_llb_recon_data_size =3D ALIGN(te_llb_recon_data_size, DMA_ALIGNMENT) *= num_vpp_pipes_enc; + + return ALIGN(te_llb_recon_data_size + te_llb_col_rc_size + te_tlb_recon_d= ata_size, + DMA_ALIGNMENT); +} + +static inline u32 calc_fe_tlb_size(u32 size_per_lcu, bool is_ten_bit) +{ + u32 num_pixels_fe_tlb_10_bit =3D 128, num_pixels_fe_tlb_8_bit =3D 64; + + return is_ten_bit ? (num_pixels_fe_tlb_10_bit * (size_per_lcu + 1)) : + (size_per_lcu * num_pixels_fe_tlb_8_bit); +} + +static u32 size_fe_lb(bool is_ten_bit, u32 standard, u32 num_vpp_pipes_enc, + u32 frame_height_coded, u32 frame_width_coded) +{ + u32 log2_lcu_size, num_cu_in_height_pipe, num_cu_in_width, + fb_llb_db_ctrl_size, fb_llb_db_luma_size, fb_llb_db_chroma_size, + fb_tlb_db_ctrl_size, fb_tlb_db_luma_size, fb_tlb_db_chroma_size, + fb_llb_sao_ctrl_size, fb_llb_sao_luma_size, fb_llb_sao_chroma_size, + fb_tlb_sao_ctrl_size, fb_tlb_sao_luma_size, fb_tlb_sao_chroma_size, + fb_lb_top_sdc_size, fb_lb_se_ctrl_size, fe_tlb_size, size_per_lcu; + u32 fe_sdc_data_per_block =3D 16, se_ctrl_data_per_block =3D 2020; + + log2_lcu_size =3D (standard =3D=3D HFI_CODEC_ENCODE_HEVC) ? 5 : 4; + num_cu_in_height_pipe =3D ((frame_height_coded >> log2_lcu_size) + num_vp= p_pipes_enc - 1) / + num_vpp_pipes_enc; + num_cu_in_width =3D frame_width_coded >> log2_lcu_size; + + size_per_lcu =3D 2; + fe_tlb_size =3D calc_fe_tlb_size(size_per_lcu, 1); + fb_llb_db_ctrl_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_hei= ght_pipe; + fb_llb_db_ctrl_size =3D ALIGN(fb_llb_db_ctrl_size, DMA_ALIGNMENT) * num_v= pp_pipes_enc; + + size_per_lcu =3D (1 << (log2_lcu_size - 3)); + fe_tlb_size =3D calc_fe_tlb_size(size_per_lcu, is_ten_bit); + fb_llb_db_luma_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_hei= ght_pipe; + fb_llb_db_luma_size =3D ALIGN(fb_llb_db_luma_size, DMA_ALIGNMENT) * num_v= pp_pipes_enc; + + size_per_lcu =3D ((1 << (log2_lcu_size - 4)) * 2); + fe_tlb_size =3D calc_fe_tlb_size(size_per_lcu, is_ten_bit); + fb_llb_db_chroma_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_h= eight_pipe; + fb_llb_db_chroma_size =3D ALIGN(fb_llb_db_chroma_size, DMA_ALIGNMENT) * n= um_vpp_pipes_enc; + + size_per_lcu =3D 1; + fe_tlb_size =3D calc_fe_tlb_size(size_per_lcu, 1); + fb_tlb_db_ctrl_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_wid= th; + fb_llb_sao_ctrl_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_he= ight_pipe; + fb_llb_sao_ctrl_size =3D fb_llb_sao_ctrl_size * num_vpp_pipes_enc; + fb_tlb_sao_ctrl_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_wi= dth; + + size_per_lcu =3D ((1 << (log2_lcu_size - 3)) + 1); + fe_tlb_size =3D calc_fe_tlb_size(size_per_lcu, is_ten_bit); + fb_tlb_db_luma_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_wid= th; + + size_per_lcu =3D (2 * ((1 << (log2_lcu_size - 4)) + 1)); + fe_tlb_size =3D calc_fe_tlb_size(size_per_lcu, is_ten_bit); + fb_tlb_db_chroma_size =3D ALIGN(fe_tlb_size, DMA_ALIGNMENT) * num_cu_in_w= idth; + + fb_llb_sao_luma_size =3D BUFFER_ALIGNMENT_256_BYTES * num_vpp_pipes_enc; + fb_llb_sao_chroma_size =3D BUFFER_ALIGNMENT_256_BYTES * num_vpp_pipes_enc; + fb_tlb_sao_luma_size =3D BUFFER_ALIGNMENT_256_BYTES; + fb_tlb_sao_chroma_size =3D BUFFER_ALIGNMENT_256_BYTES; + fb_lb_top_sdc_size =3D ALIGN((fe_sdc_data_per_block * (frame_width_coded = >> 5)), + DMA_ALIGNMENT); + fb_lb_se_ctrl_size =3D ALIGN((se_ctrl_data_per_block * (frame_width_coded= >> 5)), + DMA_ALIGNMENT); + + return fb_llb_db_ctrl_size + fb_llb_db_luma_size + fb_llb_db_chroma_size + + fb_tlb_db_ctrl_size + fb_tlb_db_luma_size + fb_tlb_db_chroma_size + + fb_llb_sao_ctrl_size + fb_llb_sao_luma_size + fb_llb_sao_chroma_size + + fb_tlb_sao_ctrl_size + fb_tlb_sao_luma_size + fb_tlb_sao_chroma_size + + fb_lb_top_sdc_size + fb_lb_se_ctrl_size; +} + +static u32 size_md_lb(u32 standard, u32 frame_width_coded, + u32 frame_height_coded, u32 num_vpp_pipes_enc) +{ + u32 md_tlb_size =3D ALIGN(frame_width_coded, DMA_ALIGNMENT); + u32 md_llb_size =3D (standard =3D=3D HFI_CODEC_ENCODE_HEVC) ? + ((frame_height_coded + BUFFER_ALIGNMENT_32_BYTES - 1) / + BUFFER_ALIGNMENT_32_BYTES) * LOG2_16 * LLB_UNIT_SIZE : + ((frame_height_coded + BUFFER_ALIGNMENT_16_BYTES - 1) / + BUFFER_ALIGNMENT_16_BYTES) * LOG2_32 * LLB_UNIT_SIZE; + + md_llb_size =3D ALIGN(md_llb_size, BUFFER_ALIGNMENT_32_BYTES); + + if (num_vpp_pipes_enc > 1) + md_llb_size =3D ALIGN(md_llb_size + BUFFER_ALIGNMENT_512_BYTES, + DMA_ALIGNMENT) * num_vpp_pipes_enc; + + md_llb_size =3D ALIGN(md_llb_size, DMA_ALIGNMENT); + + return ALIGN(md_tlb_size + md_llb_size, DMA_ALIGNMENT); +} + +static u32 size_dma_opb_lb(u32 num_vpp_pipes_enc, u32 frame_width_coded, + u32 frame_height_coded) +{ + u32 opb_packet_bytes =3D 128, opb_bpp =3D 128, opb_size_per_row =3D 6; + u32 dma_opb_wr_tlb_y_size =3D DIV_ROUND_UP(frame_width_coded, 16) * opb_p= acket_bytes; + u32 dma_opb_wr_tlb_uv_size =3D DIV_ROUND_UP(frame_width_coded, 16) * opb_= packet_bytes; + u32 dma_opb_wr2_tlb_y_size =3D ALIGN((opb_bpp * opb_size_per_row * frame_= height_coded / 8), + DMA_ALIGNMENT) * num_vpp_pipes_enc; + u32 dma_opb_wr2_tlb_uv_size =3D ALIGN((opb_bpp * opb_size_per_row * frame= _height_coded / 8), + DMA_ALIGNMENT) * num_vpp_pipes_enc; + + dma_opb_wr2_tlb_y_size =3D max(dma_opb_wr2_tlb_y_size, dma_opb_wr_tlb_y_s= ize << 1); + dma_opb_wr2_tlb_uv_size =3D max(dma_opb_wr2_tlb_uv_size, dma_opb_wr_tlb_u= v_size << 1); + + return ALIGN(dma_opb_wr_tlb_y_size + dma_opb_wr_tlb_uv_size + dma_opb_wr2= _tlb_y_size + + dma_opb_wr2_tlb_uv_size, DMA_ALIGNMENT); +} + +static u32 hfi_vpu4x_buffer_line_enc(u32 frame_width, u32 frame_height, + bool is_ten_bit, u32 num_vpp_pipes_enc, + u32 lcu_size, u32 standard) +{ + u32 width_in_lcus =3D (frame_width + lcu_size - 1) / lcu_size; + u32 height_in_lcus =3D (frame_height + lcu_size - 1) / lcu_size; + u32 frame_width_coded =3D width_in_lcus * lcu_size; + u32 frame_height_coded =3D height_in_lcus * lcu_size; + + u32 se_lb_size =3D size_se_lb(standard, num_vpp_pipes_enc, frame_width_co= ded, + frame_height_coded); + u32 te_lb_size =3D size_te_lb(is_ten_bit, num_vpp_pipes_enc, width_in_lcu= s, + frame_height_coded, frame_width_coded); + u32 fe_lb_size =3D size_fe_lb(is_ten_bit, standard, num_vpp_pipes_enc, fr= ame_height_coded, + frame_width_coded); + u32 md_lb_size =3D size_md_lb(standard, frame_width_coded, frame_height_c= oded, + num_vpp_pipes_enc); + u32 dma_opb_lb_size =3D size_dma_opb_lb(num_vpp_pipes_enc, frame_width_co= ded, + frame_height_coded); + u32 dse_lb_size =3D ALIGN((256 + (16 * (frame_width_coded >> 4))), DMA_AL= IGNMENT); + u32 size_vpss_lb_enc =3D size_vpss_line_buf_vpu33(num_vpp_pipes_enc, fram= e_width_coded, + frame_height_coded); + + return se_lb_size + te_lb_size + fe_lb_size + md_lb_size + dma_opb_lb_siz= e + + dse_lb_size + size_vpss_lb_enc; +} + +static u32 iris_vpu4x_enc_line_size(struct iris_inst *inst) +{ + u32 num_vpp_pipes =3D inst->core->iris_platform_data->num_vpp_pipe; + u32 lcu_size =3D inst->codec =3D=3D V4L2_PIX_FMT_HEVC ? 32 : 16; + struct v4l2_format *f =3D inst->fmt_dst; + u32 height =3D f->fmt.pix_mp.height; + u32 width =3D f->fmt.pix_mp.width; + + return hfi_vpu4x_buffer_line_enc(width, height, 0, num_vpp_pipes, + lcu_size, inst->codec); +} + static int output_min_count(struct iris_inst *inst) { int output_min_count =3D 4; @@ -1503,6 +1804,50 @@ u32 iris_vpu33_buf_size(struct iris_inst *inst, enum= iris_buffer_type buffer_typ return size; } =20 +u32 iris_vpu4x_buf_size(struct iris_inst *inst, enum iris_buffer_type buff= er_type) +{ + const struct iris_vpu_buf_type_handle *buf_type_handle_arr =3D NULL; + u32 size =3D 0, buf_type_handle_size =3D 0, i; + + static const struct iris_vpu_buf_type_handle dec_internal_buf_type_handle= [] =3D { + {BUF_BIN, iris_vpu_dec_bin_size }, + {BUF_COMV, iris_vpu_dec_comv_size }, + {BUF_NON_COMV, iris_vpu_dec_non_comv_size }, + {BUF_LINE, iris_vpu4x_dec_line_size }, + {BUF_PERSIST, iris_vpu4x_dec_persist_size }, + {BUF_DPB, iris_vpu_dec_dpb_size }, + {BUF_SCRATCH_1, iris_vpu_dec_scratch1_size }, + }; + + static const struct iris_vpu_buf_type_handle enc_internal_buf_type_handle= [] =3D { + {BUF_BIN, iris_vpu_enc_bin_size }, + {BUF_COMV, iris_vpu_enc_comv_size }, + {BUF_NON_COMV, iris_vpu_enc_non_comv_size }, + {BUF_LINE, iris_vpu4x_enc_line_size }, + {BUF_ARP, iris_vpu_enc_arp_size }, + {BUF_VPSS, iris_vpu_enc_vpss_size }, + {BUF_SCRATCH_1, iris_vpu_enc_scratch1_size }, + {BUF_SCRATCH_2, iris_vpu_enc_scratch2_size }, + }; + + if (inst->domain =3D=3D DECODER) { + buf_type_handle_size =3D ARRAY_SIZE(dec_internal_buf_type_handle); + buf_type_handle_arr =3D dec_internal_buf_type_handle; + } else if (inst->domain =3D=3D ENCODER) { + buf_type_handle_size =3D ARRAY_SIZE(enc_internal_buf_type_handle); + buf_type_handle_arr =3D enc_internal_buf_type_handle; + } + + for (i =3D 0; i < buf_type_handle_size; i++) { + if (buf_type_handle_arr[i].type =3D=3D buffer_type) { + size =3D buf_type_handle_arr[i].handle(inst); + break; + } + } + + return size; +} + static u32 internal_buffer_count(struct iris_inst *inst, enum iris_buffer_type buffer_type) { diff --git a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h b/drivers/m= edia/platform/qcom/iris/iris_vpu_buffer.h index 04f0b7400a1e4e1d274d690a2761b9e57778e8b7..15037e99914afc19de9f0d38eb7= 78ef63363463b 100644 --- a/drivers/media/platform/qcom/iris/iris_vpu_buffer.h +++ b/drivers/media/platform/qcom/iris/iris_vpu_buffer.h @@ -47,7 +47,12 @@ struct iris_inst; #define VP9_NUM_PROBABILITY_TABLE_BUF (VP9_NUM_FRAME_INFO_BUF + 4) #define VP9_PROB_TABLE_SIZE (3840) #define VP9_FRAME_INFO_BUF_SIZE (6144) +#define VP9_FRAME_INFO_BUF_SIZE_VPU4X (6400) +#define BUFFER_ALIGNMENT_16_BYTES 16 #define BUFFER_ALIGNMENT_32_BYTES 32 +#define BUFFER_ALIGNMENT_64_BYTES 64 +#define BUFFER_ALIGNMENT_256_BYTES 256 +#define BUFFER_ALIGNMENT_512_BYTES 512 #define CCE_TILE_OFFSET_SIZE ALIGN(32 * 4 * 4, BUFFER_ALIGNMENT_32_BYTES) #define MAX_SUPERFRAME_HEADER_LEN (34) #define MAX_FE_NBR_CTRL_LCU64_LINE_BUFFER_SIZE 64 @@ -66,6 +71,8 @@ struct iris_inst; #define H265_CABAC_HDR_RATIO_HD_TOT 2 #define H265_CABAC_RES_RATIO_HD_TOT 2 #define SIZE_H265D_VPP_CMD_PER_BUF (256) +#define SIZE_THREE_DIMENSION_USERDATA 768 +#define SIZE_H265D_ARP 9728 =20 #define VPX_DECODER_FRAME_CONCURENCY_LVL (2) #define VPX_DECODER_FRAME_BIN_HDR_BUDGET 1 @@ -76,6 +83,9 @@ struct iris_inst; =20 #define SIZE_H264D_HW_PIC_T (BIT(11)) =20 +#define MAX_PE_NBR_DATA_LCU16_LINE_BUFFER_SIZE 96 +#define MAX_PE_NBR_DATA_LCU32_LINE_BUFFER_SIZE 192 + #define MAX_FE_NBR_CTRL_LCU64_LINE_BUFFER_SIZE 64 #define MAX_SE_NBR_CTRL_LCU64_LINE_BUFFER_SIZE 16 #define MAX_PE_NBR_DATA_LCU64_LINE_BUFFER_SIZE 384 @@ -96,6 +106,10 @@ struct iris_inst; =20 #define HFI_BUFFER_ARP_ENC 204800 =20 +#define LOG2_16 4 +#define LOG2_32 5 +#define LLB_UNIT_SIZE 16 + #define MAX_WIDTH 4096 #define MAX_HEIGHT 2304 #define NUM_MBS_4K (DIV_ROUND_UP(MAX_WIDTH, 16) * DIV_ROUND_UP(MAX_HEIGHT,= 16)) @@ -148,6 +162,7 @@ static inline u32 size_h264d_qp(u32 frame_width, u32 fr= ame_height) =20 u32 iris_vpu_buf_size(struct iris_inst *inst, enum iris_buffer_type buffer= _type); u32 iris_vpu33_buf_size(struct iris_inst *inst, enum iris_buffer_type buff= er_type); +u32 iris_vpu4x_buf_size(struct iris_inst *inst, enum iris_buffer_type buff= er_type); int iris_vpu_buf_count(struct iris_inst *inst, enum iris_buffer_type buffe= r_type); =20 #endif --=20 2.34.1