From: Melissa Wen
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, harry.wentland@amd.com,
 sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com, alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch
Cc: Dmytro Laktyushkin, Jasdeep Dhillon, Qingqing Zhuo, Melissa Wen, linux-kernel@vger.kernel.org
Subject: [PATCH 1/3] drm/amd/display: move FPU related code from dcn31 to dml/dcn31 folder
Date: Mon, 7 Mar 2022 14:47:59 -0100
Message-Id: <20220307154801.2196284-2-mwen@igalia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220307154801.2196284-1-mwen@igalia.com>
References: <20220307154801.2196284-1-mwen@igalia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Create FPU files in the dml/dcn31 folder to centralize FPU operations from
the DCN 3.1x drivers, and move all FPU-associated code from the dcn31 driver
there. This includes the structs _vcs_dpi_ip_params_st and
_vcs_dpi_soc_bounding_box_st and the functions:

- dcn31_calculate_wm_and_dlg_fp()
- dcn31_update_bw_bounding_box()

Add dc_assert_fp_enabled() to them and drop DC_FP_START/END inside the
functions that were moved to the dml folder, as required.
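For readers unfamiliar with the convention, the calling pattern described above can be sketched in plain userspace C. This is an illustration only: every mock_* name below is hypothetical, the depth counter merely stands in for the kernel's FPU bookkeeping, and none of it is the actual amdgpu implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's "FPU section is open" state. */
int mock_fpu_depth;

/* Hypothetical stand-ins for DC_FP_START()/DC_FP_END(). */
#define MOCK_DC_FP_START() (mock_fpu_depth++)
#define MOCK_DC_FP_END()   (mock_fpu_depth--)

/* Hypothetical stand-in for dc_assert_fp_enabled(). */
void mock_dc_assert_fp_enabled(void)
{
	assert(mock_fpu_depth > 0);
}

/*
 * "dml" side: does the floating-point work. It never opens the FPU
 * section itself; it only checks that the caller already did.
 */
int mock_clamp_dispclk_khz_fp(int calculated_khz, int min_khz)
{
	double dispclk_mhz;

	mock_dc_assert_fp_enabled();

	dispclk_mhz = calculated_khz / 1000.0;
	if (min_khz > dispclk_mhz * 1000)
		dispclk_mhz = min_khz / 1000.0;

	/* Round back to an integer kHz value for the non-FP caller. */
	return (int)(dispclk_mhz * 1000.0 + 0.5);
}

/*
 * "resource" side: contains no floating-point arithmetic and only
 * brackets the call into the FP helper.
 */
int mock_clamp_dispclk_khz(int calculated_khz, int min_khz)
{
	int khz;

	MOCK_DC_FP_START();
	khz = mock_clamp_dispclk_khz_fp(calculated_khz, min_khz);
	MOCK_DC_FP_END();

	return khz;
}
```

Calling mock_clamp_dispclk_khz_fp() directly, outside a START/END pair, trips the assertion. That is the property the added dc_assert_fp_enabled() calls are meant to give the moved functions, with the callers left in dcn31_resource.c expected to provide the DC_FP_START/END bracket.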
Signed-off-by: Melissa Wen --- drivers/gpu/drm/amd/display/dc/dcn31/Makefile | 26 -- .../drm/amd/display/dc/dcn31/dcn31_resource.c | 355 +-------------- .../drm/amd/display/dc/dcn31/dcn31_resource.h | 4 +- drivers/gpu/drm/amd/display/dc/dml/Makefile | 2 + .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 406 ++++++++++++++++++ .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h | 39 ++ 6 files changed, 451 insertions(+), 381 deletions(-) create mode 100644 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c create mode 100644 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile b/drivers/gpu/dr= m/amd/display/dc/dcn31/Makefile index d20e3b8ccc30..ec041e3cda30 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile @@ -15,32 +15,6 @@ DCN31 =3D dcn31_resource.o dcn31_hubbub.o dcn31_hwseq.o = dcn31_init.o dcn31_hubp.o dcn31_apg.o dcn31_hpo_dp_stream_encoder.o dcn31_hpo_dp_link_encoder.o \ dcn31_afmt.o dcn31_vpg.o =20 -ifdef CONFIG_X86 -CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o :=3D -msse -endif - -ifdef CONFIG_PPC64 -CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o :=3D -mhard-float -maltivec -endif - -ifdef CONFIG_CC_IS_GCC -ifeq ($(call cc-ifversion, -lt, 0701, y), y) -IS_OLD_GCC =3D 1 -endif -CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o +=3D -mhard-float -endif - -ifdef CONFIG_X86 -ifdef IS_OLD_GCC -# Stack alignment mismatch, proceed with caution. -# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-bound= ary=3D3 -# (8B stack alignment). 
-CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o +=3D -mpreferred-stack-boun= dary=3D4 -else -CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o +=3D -msse2 -endif -endif - AMD_DAL_DCN31 =3D $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31)) =20 AMD_DISPLAY_FILES +=3D $(AMD_DAL_DCN31) diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/driver= s/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c index e8f38f4a9378..0e51ac029c8a 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c @@ -65,6 +65,7 @@ #include "virtual/virtual_stream_encoder.h" #include "dce110/dce110_resource.h" #include "dml/display_mode_vba.h" +#include "dml/dcn31/dcn31_fpu.h" #include "dcn31/dcn31_dccg.h" #include "dcn10/dcn10_resource.h" #include "dcn31_panel_cntl.h" @@ -102,152 +103,6 @@ =20 #define DC_LOGGER_INIT(logger) =20 -#define DCN3_1_DEFAULT_DET_SIZE 384 - -struct _vcs_dpi_ip_params_st dcn3_1_ip =3D { - .gpuvm_enable =3D 1, - .gpuvm_max_page_table_levels =3D 1, - .hostvm_enable =3D 1, - .hostvm_max_page_table_levels =3D 2, - .rob_buffer_size_kbytes =3D 64, - .det_buffer_size_kbytes =3D DCN3_1_DEFAULT_DET_SIZE, - .config_return_buffer_size_in_kbytes =3D 1792, - .compressed_buffer_segment_size_in_kbytes =3D 64, - .meta_fifo_size_in_kentries =3D 32, - .zero_size_buffer_entries =3D 512, - .compbuf_reserved_space_64b =3D 256, - .compbuf_reserved_space_zs =3D 64, - .dpp_output_buffer_pixels =3D 2560, - .opp_output_buffer_lines =3D 1, - .pixel_chunk_size_kbytes =3D 8, - .meta_chunk_size_kbytes =3D 2, - .min_meta_chunk_size_bytes =3D 256, - .writeback_chunk_size_kbytes =3D 8, - .ptoi_supported =3D false, - .num_dsc =3D 3, - .maximum_dsc_bits_per_component =3D 10, - .dsc422_native_support =3D false, - .is_line_buffer_bpp_fixed =3D true, - .line_buffer_fixed_bpp =3D 48, - .line_buffer_size_bits =3D 789504, - .max_line_buffer_lines =3D 12, - .writeback_interface_buffer_size_kbytes =3D 90, - .max_num_dpp =3D 4, - .max_num_otg 
=3D 4, - .max_num_hdmi_frl_outputs =3D 1, - .max_num_wb =3D 1, - .max_dchub_pscl_bw_pix_per_clk =3D 4, - .max_pscl_lb_bw_pix_per_clk =3D 2, - .max_lb_vscl_bw_pix_per_clk =3D 4, - .max_vscl_hscl_bw_pix_per_clk =3D 4, - .max_hscl_ratio =3D 6, - .max_vscl_ratio =3D 6, - .max_hscl_taps =3D 8, - .max_vscl_taps =3D 8, - .dpte_buffer_size_in_pte_reqs_luma =3D 64, - .dpte_buffer_size_in_pte_reqs_chroma =3D 34, - .dispclk_ramp_margin_percent =3D 1, - .max_inter_dcn_tile_repeaters =3D 8, - .cursor_buffer_size =3D 16, - .cursor_chunk_size =3D 2, - .writeback_line_buffer_buffer_size =3D 0, - .writeback_min_hscl_ratio =3D 1, - .writeback_min_vscl_ratio =3D 1, - .writeback_max_hscl_ratio =3D 1, - .writeback_max_vscl_ratio =3D 1, - .writeback_max_hscl_taps =3D 1, - .writeback_max_vscl_taps =3D 1, - .dppclk_delay_subtotal =3D 46, - .dppclk_delay_scl =3D 50, - .dppclk_delay_scl_lb_only =3D 16, - .dppclk_delay_cnvc_formatter =3D 27, - .dppclk_delay_cnvc_cursor =3D 6, - .dispclk_delay_subtotal =3D 119, - .dynamic_metadata_vm_enabled =3D false, - .odm_combine_4to1_supported =3D false, - .dcc_supported =3D true, -}; - -struct _vcs_dpi_soc_bounding_box_st dcn3_1_soc =3D { - /*TODO: correct dispclk/dppclk voltage level determination*/ - .clock_limits =3D { - { - .state =3D 0, - .dispclk_mhz =3D 1200.0, - .dppclk_mhz =3D 1200.0, - .phyclk_mhz =3D 600.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 186.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 1, - .dispclk_mhz =3D 1200.0, - .dppclk_mhz =3D 1200.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 209.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 2, - .dispclk_mhz =3D 1200.0, - .dppclk_mhz =3D 1200.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 209.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 3, - .dispclk_mhz =3D 1200.0, - .dppclk_mhz =3D 1200.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 371.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 
4, - .dispclk_mhz =3D 1200.0, - .dppclk_mhz =3D 1200.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 417.0, - .dtbclk_mhz =3D 625.0, - }, - }, - .num_states =3D 5, - .sr_exit_time_us =3D 9.0, - .sr_enter_plus_exit_time_us =3D 11.0, - .sr_exit_z8_time_us =3D 442.0, - .sr_enter_plus_exit_z8_time_us =3D 560.0, - .writeback_latency_us =3D 12.0, - .dram_channel_width_bytes =3D 4, - .round_trip_ping_latency_dcfclk_cycles =3D 106, - .urgent_latency_pixel_data_only_us =3D 4.0, - .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, - .urgent_latency_vm_data_only_us =3D 4.0, - .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, - .pct_ideal_sdp_bw_after_urgent =3D 80.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 65.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 60.0, - .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 30.0, - .max_avg_sdp_bw_use_normal_percent =3D 60.0, - .max_avg_dram_bw_use_normal_percent =3D 60.0, - .fabric_datapath_to_dcn_data_return_bytes =3D 32, - .return_bus_width_bytes =3D 64, - .downspread_percent =3D 0.38, - .dcn_downspread_percent =3D 0.5, - .gpuvm_min_page_size_bytes =3D 4096, - .hostvm_min_page_size_bytes =3D 4096, - .do_urgent_latency_adjustment =3D false, - .urgent_latency_adjustment_fabric_clock_component_us =3D 0, - .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, -}; - enum dcn31_clk_src_array_id { DCN31_CLK_SRC_PLL0, DCN31_CLK_SRC_PLL1, @@ -1869,143 +1724,6 @@ void dcn31_update_soc_for_wm_a(struct dc *dc, struc= t dc_state *context) } } =20 -static void dcn31_calculate_wm_and_dlg_fp( - struct dc *dc, struct dc_state *context, - display_e2e_pipe_params_st *pipes, - int pipe_cnt, - int vlevel) -{ - int i, pipe_idx; - double dcfclk =3D context->bw_ctx.dml.vba.DCFCLKState[vlevel][context->bw= _ctx.dml.vba.maxMpcComb]; - - if 
(context->bw_ctx.dml.soc.min_dcfclk > dcfclk) - dcfclk =3D context->bw_ctx.dml.soc.min_dcfclk; - - /* We don't recalculate clocks for 0 pipe configs, which can block - * S0i3 as high clocks will block low power states - * Override any clocks that can block S0i3 to min here - */ - if (pipe_cnt =3D=3D 0) { - context->bw_ctx.bw.dcn.clk.dcfclk_khz =3D dcfclk; // always should be vl= evel 0 - return; - } - - pipes[0].clks_cfg.voltage =3D vlevel; - pipes[0].clks_cfg.dcfclk_mhz =3D dcfclk; - pipes[0].clks_cfg.socclk_mhz =3D context->bw_ctx.dml.soc.clock_limits[vle= vel].socclk_mhz; - -#if 0 // TODO - /* Set B: - * TODO - */ - if (dc->clk_mgr->bw_params->wm_table.nv_entries[WM_B].valid) { - if (vlevel =3D=3D 0) { - pipes[0].clks_cfg.voltage =3D 1; - pipes[0].clks_cfg.dcfclk_mhz =3D context->bw_ctx.dml.soc.clock_limits[0= ].dcfclk_mhz; - } - context->bw_ctx.dml.soc.dram_clock_change_latency_us =3D dc->clk_mgr->bw= _params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us; - context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us =3D dc->clk_mgr->bw_p= arams->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us; - context->bw_ctx.dml.soc.sr_exit_time_us =3D dc->clk_mgr->bw_params->wm_t= able.nv_entries[WM_B].dml_input.sr_exit_time_us; - } - context->bw_ctx.bw.dcn.watermarks.b.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; - context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - 
context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.b.urgent_latency_ns =3D get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - - pipes[0].clks_cfg.voltage =3D vlevel; - pipes[0].clks_cfg.dcfclk_mhz =3D dcfclk; - - /* Set C: - * TODO - */ - if (dc->clk_mgr->bw_params->wm_table.nv_entries[WM_C].valid) { - context->bw_ctx.dml.soc.dram_clock_change_latency_us =3D dc->clk_mgr->bw= _params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us; - context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us =3D dc->clk_mgr->bw_p= arams->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us; - context->bw_ctx.dml.soc.sr_exit_time_us =3D dc->clk_mgr->bw_params->wm_t= able.nv_entries[WM_C].dml_input.sr_exit_time_us; - } - context->bw_ctx.bw.dcn.watermarks.c.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; - context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, 
pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.c.urgent_latency_ns =3D get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - - /* Set D: - * TODO - */ - if (dc->clk_mgr->bw_params->wm_table.nv_entries[WM_D].valid) { - context->bw_ctx.dml.soc.dram_clock_change_latency_us =3D dc->clk_mgr->bw= _params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us; - context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us =3D dc->clk_mgr->bw_p= arams->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us; - context->bw_ctx.dml.soc.sr_exit_time_us =3D dc->clk_mgr->bw_params->wm_t= able.nv_entries[WM_D].dml_input.sr_exit_time_us; - } - context->bw_ctx.bw.dcn.watermarks.d.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; - context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; - 
context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.d.urgent_latency_ns =3D get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; -#endif - - /* Set A: - * All clocks min required - * - * Set A calculated last so that following calculations are based on Set A - */ - dc->res_pool->funcs->update_soc_for_wm_a(dc, context); - context->bw_ctx.bw.dcn.watermarks.a.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; - context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, 
pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - context->bw_ctx.bw.dcn.watermarks.a.urgent_latency_ns =3D get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; - /* TODO: remove: */ - context->bw_ctx.bw.dcn.watermarks.b =3D context->bw_ctx.bw.dcn.watermarks= .a; - context->bw_ctx.bw.dcn.watermarks.c =3D context->bw_ctx.bw.dcn.watermarks= .a; - context->bw_ctx.bw.dcn.watermarks.d =3D context->bw_ctx.bw.dcn.watermarks= .a; - /* end remove*/ - - for (i =3D 0, pipe_idx =3D 0; i < dc->res_pool->pipe_count; i++) { - if (!context->res_ctx.pipe_ctx[i].stream) - continue; - - pipes[pipe_idx].clks_cfg.dispclk_mhz =3D get_dispclk_calculated(&context= ->bw_ctx.dml, pipes, pipe_cnt); - pipes[pipe_idx].clks_cfg.dppclk_mhz =3D get_dppclk_calculated(&context->= bw_ctx.dml, pipes, pipe_cnt, pipe_idx); - - if (dc->config.forced_clocks || dc->debug.max_disp_clk) { - pipes[pipe_idx].clks_cfg.dispclk_mhz =3D context->bw_ctx.dml.soc.clock_= limits[0].dispclk_mhz; - pipes[pipe_idx].clks_cfg.dppclk_mhz =3D context->bw_ctx.dml.soc.clock_l= imits[0].dppclk_mhz; - } - if (dc->debug.min_disp_clk_khz > pipes[pipe_idx].clks_cfg.dispclk_mhz * = 1000) - pipes[pipe_idx].clks_cfg.dispclk_mhz =3D dc->debug.min_disp_clk_khz / 1= 000.0; - if (dc->debug.min_dpp_clk_khz > pipes[pipe_idx].clks_cfg.dppclk_mhz * 10= 00) - pipes[pipe_idx].clks_cfg.dppclk_mhz =3D dc->debug.min_dpp_clk_khz / 100= 0.0; - - pipe_idx++; - } - - DC_FP_START(); - dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); - DC_FP_END(); -} - void dcn31_calculate_wm_and_dlg( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, @@ -2073,77 +1791,6 @@ static struct dc_cap_funcs cap_funcs =3D { .get_dcc_compression_cap =3D dcn20_get_dcc_compression_cap }; =20 -void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params) -{ - struct 
clk_limit_table *clk_table =3D &bw_params->clk_table; - struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; - unsigned int i, closest_clk_lvl; - int j; - - // Default clock levels are used for diags, which may lead to overclockin= g. - if (!IS_DIAG_DC(dc->ctx->dce_environment)) { - int max_dispclk_mhz =3D 0, max_dppclk_mhz =3D 0; - - dcn3_1_ip.max_num_otg =3D dc->res_pool->res_cap->num_timing_generator; - dcn3_1_ip.max_num_dpp =3D dc->res_pool->pipe_count; - dcn3_1_soc.num_chans =3D bw_params->num_channels; - - ASSERT(clk_table->num_entries); - - /* Prepass to find max clocks independent of voltage level. */ - for (i =3D 0; i < clk_table->num_entries; ++i) { - if (clk_table->entries[i].dispclk_mhz > max_dispclk_mhz) - max_dispclk_mhz =3D clk_table->entries[i].dispclk_mhz; - if (clk_table->entries[i].dppclk_mhz > max_dppclk_mhz) - max_dppclk_mhz =3D clk_table->entries[i].dppclk_mhz; - } - - for (i =3D 0; i < clk_table->num_entries; i++) { - /* loop backwards*/ - for (closest_clk_lvl =3D 0, j =3D dcn3_1_soc.num_states - 1; j >=3D 0; = j--) { - if ((unsigned int) dcn3_1_soc.clock_limits[j].dcfclk_mhz <=3D clk_tabl= e->entries[i].dcfclk_mhz) { - closest_clk_lvl =3D j; - break; - } - } - - clock_limits[i].state =3D i; - - /* Clocks dependent on voltage level. */ - clock_limits[i].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; - clock_limits[i].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; - clock_limits[i].socclk_mhz =3D clk_table->entries[i].socclk_mhz; - clock_limits[i].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2= * clk_table->entries[i].wck_ratio; - - /* Clocks independent of voltage level. */ - clock_limits[i].dispclk_mhz =3D max_dispclk_mhz ? max_dispclk_mhz : - dcn3_1_soc.clock_limits[closest_clk_lvl].dispclk_mhz; - - clock_limits[i].dppclk_mhz =3D max_dppclk_mhz ? 
max_dppclk_mhz : - dcn3_1_soc.clock_limits[closest_clk_lvl].dppclk_mhz; - - clock_limits[i].dram_bw_per_chan_gbps =3D dcn3_1_soc.clock_limits[close= st_clk_lvl].dram_bw_per_chan_gbps; - clock_limits[i].dscclk_mhz =3D dcn3_1_soc.clock_limits[closest_clk_lvl]= .dscclk_mhz; - clock_limits[i].dtbclk_mhz =3D dcn3_1_soc.clock_limits[closest_clk_lvl]= .dtbclk_mhz; - clock_limits[i].phyclk_d18_mhz =3D dcn3_1_soc.clock_limits[closest_clk_= lvl].phyclk_d18_mhz; - clock_limits[i].phyclk_mhz =3D dcn3_1_soc.clock_limits[closest_clk_lvl]= .phyclk_mhz; - } - for (i =3D 0; i < clk_table->num_entries; i++) - dcn3_1_soc.clock_limits[i] =3D clock_limits[i]; - if (clk_table->num_entries) { - dcn3_1_soc.num_states =3D clk_table->num_entries; - } - } - - dcn3_1_soc.dispclk_dppclk_vco_speed_mhz =3D dc->clk_mgr->dentist_vco_freq= _khz / 1000.0; - dc->dml.soc.dispclk_dppclk_vco_speed_mhz =3D dc->clk_mgr->dentist_vco_fre= q_khz / 1000.0; - - if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) - dml_init_instance(&dc->dml, &dcn3_1_soc, &dcn3_1_ip, DML_PROJECT_DCN31); - else - dml_init_instance(&dc->dml, &dcn3_1_soc, &dcn3_1_ip, DML_PROJECT_DCN31_F= PGA); -} - static struct resource_funcs dcn31_res_pool_funcs =3D { .destroy =3D dcn31_destroy_resource_pool, .link_enc_create =3D dcn31_link_encoder_create, diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.h b/driver= s/gpu/drm/amd/display/dc/dcn31/dcn31_resource.h index 4b7ab21ea15b..1ce6509c1ed1 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.h @@ -31,6 +31,9 @@ #define TO_DCN31_RES_POOL(pool)\ container_of(pool, struct dcn31_resource_pool, base) =20 +extern struct _vcs_dpi_ip_params_st dcn3_1_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn3_1_soc; + struct dcn31_resource_pool { struct resource_pool base; }; @@ -47,7 +50,6 @@ int dcn31_populate_dml_pipes_from_context( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, bool 
fast_validate); -void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params); void dcn31_update_soc_for_wm_a(struct dc *dc, struct dc_state *context); =20 struct resource_pool *dcn31_create_resource_pool( diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/= amd/display/dc/dml/Makefile index 28978ce62f87..ee911452c048 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile @@ -71,6 +71,7 @@ CFLAGS_$(AMDDALPATH)/dc/dml/dcn30/display_mode_vba_30.o := =3D $(dml_ccflags) $(fram CFLAGS_$(AMDDALPATH)/dc/dml/dcn30/display_rq_dlg_calc_30.o :=3D $(dml_ccfl= ags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn31/display_mode_vba_31.o :=3D $(dml_ccflags= ) $(frame_warn_flag) CFLAGS_$(AMDDALPATH)/dc/dml/dcn31/display_rq_dlg_calc_31.o :=3D $(dml_ccfl= ags) +CFLAGS_$(AMDDALPATH)/dc/dml/dcn31/dcn31_fpu.o :=3D $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn301/dcn301_fpu.o :=3D $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn302/dcn302_fpu.o :=3D $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn303/dcn303_fpu.o :=3D $(dml_ccflags) @@ -114,6 +115,7 @@ DML +=3D dcn20/display_rq_dlg_calc_20v2.o dcn20/display= _mode_vba_20v2.o DML +=3D dcn21/display_rq_dlg_calc_21.o dcn21/display_mode_vba_21.o DML +=3D dcn30/display_mode_vba_30.o dcn30/display_rq_dlg_calc_30.o DML +=3D dcn31/display_mode_vba_31.o dcn31/display_rq_dlg_calc_31.o +DML +=3D dcn31/dcn31_fpu.o DML +=3D dcn301/dcn301_fpu.o DML +=3D dcn302/dcn302_fpu.o DML +=3D dcn303/dcn303_fpu.o diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers= /gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c new file mode 100644 index 000000000000..7ff8fe9e8712 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c @@ -0,0 +1,406 @@ +/* + * Copyright 2019-2021 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software= "), + * to deal in the Software without restriction, including without limitati= on + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#include "resource.h" +#include "clk_mgr.h" + +#include "dml/dcn20/dcn20_fpu.h" +#include "dcn31_fpu.h" + +/** + * DOC: DCN31x FPU manipulation Overview + * + * The DCN architecture relies on FPU operations, which require special + * compilation flags and the use of kernel_fpu_begin/end functions; ideall= y, we + * want to avoid spreading FPU access across multiple files. With this ide= a in + * mind, this file aims to centralize all DCN3.1.x functions that require = FPU + * access in a single place. Code in this file follows the following code + * pattern: + * + * 1. Functions that use FPU operations should be isolated in static funct= ions. + * 2. The FPU functions should have the noinline attribute to ensure anyth= ing + * that deals with FP register is contained within this call. + * 3. 
All functions that need to be accessed outside this file require a + * public interface that does not use any FPU reference. + * 4. Developers **must not** use DC_FP_START/END in this file, but they need + * to ensure that the caller invokes it before accessing any function available + * in this file. For this reason, public functions in this file must invoke + * dc_assert_fp_enabled(); + */ + +struct _vcs_dpi_ip_params_st dcn3_1_ip =3D { + .gpuvm_enable =3D 1, + .gpuvm_max_page_table_levels =3D 1, + .hostvm_enable =3D 1, + .hostvm_max_page_table_levels =3D 2, + .rob_buffer_size_kbytes =3D 64, + .det_buffer_size_kbytes =3D DCN3_1_DEFAULT_DET_SIZE, + .config_return_buffer_size_in_kbytes =3D 1792, + .compressed_buffer_segment_size_in_kbytes =3D 64, + .meta_fifo_size_in_kentries =3D 32, + .zero_size_buffer_entries =3D 512, + .compbuf_reserved_space_64b =3D 256, + .compbuf_reserved_space_zs =3D 64, + .dpp_output_buffer_pixels =3D 2560, + .opp_output_buffer_lines =3D 1, + .pixel_chunk_size_kbytes =3D 8, + .meta_chunk_size_kbytes =3D 2, + .min_meta_chunk_size_bytes =3D 256, + .writeback_chunk_size_kbytes =3D 8, + .ptoi_supported =3D false, + .num_dsc =3D 3, + .maximum_dsc_bits_per_component =3D 10, + .dsc422_native_support =3D false, + .is_line_buffer_bpp_fixed =3D true, + .line_buffer_fixed_bpp =3D 48, + .line_buffer_size_bits =3D 789504, + .max_line_buffer_lines =3D 12, + .writeback_interface_buffer_size_kbytes =3D 90, + .max_num_dpp =3D 4, + .max_num_otg =3D 4, + .max_num_hdmi_frl_outputs =3D 1, + .max_num_wb =3D 1, + .max_dchub_pscl_bw_pix_per_clk =3D 4, + .max_pscl_lb_bw_pix_per_clk =3D 2, + .max_lb_vscl_bw_pix_per_clk =3D 4, + .max_vscl_hscl_bw_pix_per_clk =3D 4, + .max_hscl_ratio =3D 6, + .max_vscl_ratio =3D 6, + .max_hscl_taps =3D 8, + .max_vscl_taps =3D 8, + .dpte_buffer_size_in_pte_reqs_luma =3D 64, + .dpte_buffer_size_in_pte_reqs_chroma =3D 34, + .dispclk_ramp_margin_percent =3D 1, + .max_inter_dcn_tile_repeaters =3D 8, + .cursor_buffer_size =3D 16, +
.cursor_chunk_size =3D 2, + .writeback_line_buffer_buffer_size =3D 0, + .writeback_min_hscl_ratio =3D 1, + .writeback_min_vscl_ratio =3D 1, + .writeback_max_hscl_ratio =3D 1, + .writeback_max_vscl_ratio =3D 1, + .writeback_max_hscl_taps =3D 1, + .writeback_max_vscl_taps =3D 1, + .dppclk_delay_subtotal =3D 46, + .dppclk_delay_scl =3D 50, + .dppclk_delay_scl_lb_only =3D 16, + .dppclk_delay_cnvc_formatter =3D 27, + .dppclk_delay_cnvc_cursor =3D 6, + .dispclk_delay_subtotal =3D 119, + .dynamic_metadata_vm_enabled =3D false, + .odm_combine_4to1_supported =3D false, + .dcc_supported =3D true, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn3_1_soc =3D { + /*TODO: correct dispclk/dppclk voltage level determination*/ + .clock_limits =3D { + { + .state =3D 0, + .dispclk_mhz =3D 1200.0, + .dppclk_mhz =3D 1200.0, + .phyclk_mhz =3D 600.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 186.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 1, + .dispclk_mhz =3D 1200.0, + .dppclk_mhz =3D 1200.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 209.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 2, + .dispclk_mhz =3D 1200.0, + .dppclk_mhz =3D 1200.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 209.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 3, + .dispclk_mhz =3D 1200.0, + .dppclk_mhz =3D 1200.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 371.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 4, + .dispclk_mhz =3D 1200.0, + .dppclk_mhz =3D 1200.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + .dtbclk_mhz =3D 625.0, + }, + }, + .num_states =3D 5, + .sr_exit_time_us =3D 9.0, + .sr_enter_plus_exit_time_us =3D 11.0, + .sr_exit_z8_time_us =3D 442.0, + .sr_enter_plus_exit_z8_time_us =3D 560.0, + .writeback_latency_us =3D 12.0, + .dram_channel_width_bytes =3D 4, + .round_trip_ping_latency_dcfclk_cycles =3D 106, + .urgent_latency_pixel_data_only_us =3D 4.0, + 
.urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, + .urgent_latency_vm_data_only_us =3D 4.0, + .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, + .pct_ideal_sdp_bw_after_urgent =3D 80.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 65.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 60.0, + .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 30.0, + .max_avg_sdp_bw_use_normal_percent =3D 60.0, + .max_avg_dram_bw_use_normal_percent =3D 60.0, + .fabric_datapath_to_dcn_data_return_bytes =3D 32, + .return_bus_width_bytes =3D 64, + .downspread_percent =3D 0.38, + .dcn_downspread_percent =3D 0.5, + .gpuvm_min_page_size_bytes =3D 4096, + .hostvm_min_page_size_bytes =3D 4096, + .do_urgent_latency_adjustment =3D false, + .urgent_latency_adjustment_fabric_clock_component_us =3D 0, + .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, +}; + + +void dcn31_calculate_wm_and_dlg_fp( + struct dc *dc, struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int pipe_cnt, + int vlevel) +{ + int i, pipe_idx; + double dcfclk =3D context->bw_ctx.dml.vba.DCFCLKState[vlevel][context->bw= _ctx.dml.vba.maxMpcComb]; + + dc_assert_fp_enabled(); + + if (context->bw_ctx.dml.soc.min_dcfclk > dcfclk) + dcfclk =3D context->bw_ctx.dml.soc.min_dcfclk; + + /* We don't recalculate clocks for 0 pipe configs, which can block + * S0i3 as high clocks will block low power states + * Override any clocks that can block S0i3 to min here + */ + if (pipe_cnt =3D=3D 0) { + context->bw_ctx.bw.dcn.clk.dcfclk_khz =3D dcfclk; // always should be vl= evel 0 + return; + } + + pipes[0].clks_cfg.voltage =3D vlevel; + pipes[0].clks_cfg.dcfclk_mhz =3D dcfclk; + pipes[0].clks_cfg.socclk_mhz =3D context->bw_ctx.dml.soc.clock_limits[vle= vel].socclk_mhz; + +#if 0 // TODO + /* Set B: + * TODO + */ + if 
(dc->clk_mgr->bw_params->wm_table.nv_entries[WM_B].valid) { + if (vlevel =3D=3D 0) { + pipes[0].clks_cfg.voltage =3D 1; + pipes[0].clks_cfg.dcfclk_mhz =3D context->bw_ctx.dml.soc.clock_limits[0= ].dcfclk_mhz; + } + context->bw_ctx.dml.soc.dram_clock_change_latency_us =3D dc->clk_mgr->bw= _params->wm_table.nv_entries[WM_B].dml_input.pstate_latency_us; + context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us =3D dc->clk_mgr->bw_p= arams->wm_table.nv_entries[WM_B].dml_input.sr_enter_plus_exit_time_us; + context->bw_ctx.dml.soc.sr_exit_time_us =3D dc->clk_mgr->bw_params->wm_t= able.nv_entries[WM_B].dml_input.sr_exit_time_us; + } + context->bw_ctx.bw.dcn.watermarks.b.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; + context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.b.urgent_latency_ns =3D get_urgent_late= 
ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + + pipes[0].clks_cfg.voltage =3D vlevel; + pipes[0].clks_cfg.dcfclk_mhz =3D dcfclk; + + /* Set C: + * TODO + */ + if (dc->clk_mgr->bw_params->wm_table.nv_entries[WM_C].valid) { + context->bw_ctx.dml.soc.dram_clock_change_latency_us =3D dc->clk_mgr->bw= _params->wm_table.nv_entries[WM_C].dml_input.pstate_latency_us; + context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us =3D dc->clk_mgr->bw_p= arams->wm_table.nv_entries[WM_C].dml_input.sr_enter_plus_exit_time_us; + context->bw_ctx.dml.soc.sr_exit_time_us =3D dc->clk_mgr->bw_params->wm_t= able.nv_entries[WM_C].dml_input.sr_exit_time_us; + } + context->bw_ctx.bw.dcn.watermarks.c.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; + context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.c.urgent_latency_ns =3D 
get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + + /* Set D: + * TODO + */ + if (dc->clk_mgr->bw_params->wm_table.nv_entries[WM_D].valid) { + context->bw_ctx.dml.soc.dram_clock_change_latency_us =3D dc->clk_mgr->bw= _params->wm_table.nv_entries[WM_D].dml_input.pstate_latency_us; + context->bw_ctx.dml.soc.sr_enter_plus_exit_time_us =3D dc->clk_mgr->bw_p= arams->wm_table.nv_entries[WM_D].dml_input.sr_enter_plus_exit_time_us; + context->bw_ctx.dml.soc.sr_exit_time_us =3D dc->clk_mgr->bw_params->wm_t= able.nv_entries[WM_D].dml_input.sr_exit_time_us; + } + context->bw_ctx.bw.dcn.watermarks.d.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; + context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.d.urgent_latency_ns =3D get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; 
+#endif + + /* Set A: + * All clocks min required + * + * Set A calculated last so that following calculations are based on Set A + */ + dc->res_pool->funcs->update_soc_for_wm_a(dc, context); + context->bw_ctx.bw.dcn.watermarks.a.urgent_ns =3D get_wm_urgent(&context-= >bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_enter_plus_exit_= ns =3D get_wm_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1= 000; + context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_exit_ns =3D get_= wm_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.pstate_change_ns =3D ge= t_wm_dram_clock_change(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_enter_plus_exit_= z8_ns =3D get_wm_z8_stutter_enter_exit(&context->bw_ctx.dml, pipes, pipe_cn= t) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.cstate_pstate.cstate_exit_z8_ns =3D g= et_wm_z8_stutter_exit(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.pte_meta_urgent_ns =3D get_wm_memory_= trip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.frac_urg_bw_nom =3D get_fraction_of_u= rgent_bandwidth(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.frac_urg_bw_flip =3D get_fraction_of_= urgent_bandwidth_imm_flip(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + context->bw_ctx.bw.dcn.watermarks.a.urgent_latency_ns =3D get_urgent_late= ncy(&context->bw_ctx.dml, pipes, pipe_cnt) * 1000; + /* TODO: remove: */ + context->bw_ctx.bw.dcn.watermarks.b =3D context->bw_ctx.bw.dcn.watermarks= .a; + context->bw_ctx.bw.dcn.watermarks.c =3D context->bw_ctx.bw.dcn.watermarks= .a; + context->bw_ctx.bw.dcn.watermarks.d =3D context->bw_ctx.bw.dcn.watermarks= .a; + /* end remove*/ + + for (i =3D 0, pipe_idx =3D 0; i < dc->res_pool->pipe_count; i++) { + if 
(!context->res_ctx.pipe_ctx[i].stream) + continue; + + pipes[pipe_idx].clks_cfg.dispclk_mhz =3D get_dispclk_calculated(&context= ->bw_ctx.dml, pipes, pipe_cnt); + pipes[pipe_idx].clks_cfg.dppclk_mhz =3D get_dppclk_calculated(&context->= bw_ctx.dml, pipes, pipe_cnt, pipe_idx); + + if (dc->config.forced_clocks || dc->debug.max_disp_clk) { + pipes[pipe_idx].clks_cfg.dispclk_mhz =3D context->bw_ctx.dml.soc.clock_= limits[0].dispclk_mhz; + pipes[pipe_idx].clks_cfg.dppclk_mhz =3D context->bw_ctx.dml.soc.clock_l= imits[0].dppclk_mhz; + } + if (dc->debug.min_disp_clk_khz > pipes[pipe_idx].clks_cfg.dispclk_mhz * = 1000) + pipes[pipe_idx].clks_cfg.dispclk_mhz =3D dc->debug.min_disp_clk_khz / 1= 000.0; + if (dc->debug.min_dpp_clk_khz > pipes[pipe_idx].clks_cfg.dppclk_mhz * 10= 00) + pipes[pipe_idx].clks_cfg.dppclk_mhz =3D dc->debug.min_dpp_clk_khz / 100= 0.0; + + pipe_idx++; + } + + dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); +} + +void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params) +{ + struct clk_limit_table *clk_table =3D &bw_params->clk_table; + struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; + unsigned int i, closest_clk_lvl; + int j; + + dc_assert_fp_enabled(); + + // Default clock levels are used for diags, which may lead to overclockin= g. + if (!IS_DIAG_DC(dc->ctx->dce_environment)) { + int max_dispclk_mhz =3D 0, max_dppclk_mhz =3D 0; + + dcn3_1_ip.max_num_otg =3D dc->res_pool->res_cap->num_timing_generator; + dcn3_1_ip.max_num_dpp =3D dc->res_pool->pipe_count; + dcn3_1_soc.num_chans =3D bw_params->num_channels; + + ASSERT(clk_table->num_entries); + + /* Prepass to find max clocks independent of voltage level. 
*/ + for (i =3D 0; i < clk_table->num_entries; ++i) { + if (clk_table->entries[i].dispclk_mhz > max_dispclk_mhz) + max_dispclk_mhz =3D clk_table->entries[i].dispclk_mhz; + if (clk_table->entries[i].dppclk_mhz > max_dppclk_mhz) + max_dppclk_mhz =3D clk_table->entries[i].dppclk_mhz; + } + + for (i =3D 0; i < clk_table->num_entries; i++) { + /* loop backwards*/ + for (closest_clk_lvl =3D 0, j =3D dcn3_1_soc.num_states - 1; j >=3D 0; = j--) { + if ((unsigned int) dcn3_1_soc.clock_limits[j].dcfclk_mhz <=3D clk_tabl= e->entries[i].dcfclk_mhz) { + closest_clk_lvl =3D j; + break; + } + } + + clock_limits[i].state =3D i; + + /* Clocks dependent on voltage level. */ + clock_limits[i].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; + clock_limits[i].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; + clock_limits[i].socclk_mhz =3D clk_table->entries[i].socclk_mhz; + clock_limits[i].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2= * clk_table->entries[i].wck_ratio; + + /* Clocks independent of voltage level. */ + clock_limits[i].dispclk_mhz =3D max_dispclk_mhz ? max_dispclk_mhz : + dcn3_1_soc.clock_limits[closest_clk_lvl].dispclk_mhz; + + clock_limits[i].dppclk_mhz =3D max_dppclk_mhz ? 
max_dppclk_mhz : + dcn3_1_soc.clock_limits[closest_clk_lvl].dppclk_mhz; + + clock_limits[i].dram_bw_per_chan_gbps =3D dcn3_1_soc.clock_limits[close= st_clk_lvl].dram_bw_per_chan_gbps; + clock_limits[i].dscclk_mhz =3D dcn3_1_soc.clock_limits[closest_clk_lvl]= .dscclk_mhz; + clock_limits[i].dtbclk_mhz =3D dcn3_1_soc.clock_limits[closest_clk_lvl]= .dtbclk_mhz; + clock_limits[i].phyclk_d18_mhz =3D dcn3_1_soc.clock_limits[closest_clk_= lvl].phyclk_d18_mhz; + clock_limits[i].phyclk_mhz =3D dcn3_1_soc.clock_limits[closest_clk_lvl]= .phyclk_mhz; + } + for (i =3D 0; i < clk_table->num_entries; i++) + dcn3_1_soc.clock_limits[i] =3D clock_limits[i]; + if (clk_table->num_entries) { + dcn3_1_soc.num_states =3D clk_table->num_entries; + } + } + + dcn3_1_soc.dispclk_dppclk_vco_speed_mhz =3D dc->clk_mgr->dentist_vco_freq= _khz / 1000.0; + dc->dml.soc.dispclk_dppclk_vco_speed_mhz =3D dc->clk_mgr->dentist_vco_fre= q_khz / 1000.0; + + if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) + dml_init_instance(&dc->dml, &dcn3_1_soc, &dcn3_1_ip, DML_PROJECT_DCN31); + else + dml_init_instance(&dc->dml, &dcn3_1_soc, &dcn3_1_ip, DML_PROJECT_DCN31_F= PGA); +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers= /gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h new file mode 100644 index 000000000000..baadb5150e7d --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h @@ -0,0 +1,39 @@ +/* + * Copyright 2019-2021 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software= "), + * to deal in the Software without restriction, including without limitati= on + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + * Authors: AMD + * + */ + +#ifndef __DCN31_FPU_H__ +#define __DCN31_FPU_H__ + +#define DCN3_1_DEFAULT_DET_SIZE 384 + +void dcn31_calculate_wm_and_dlg_fp( + struct dc *dc, struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int pipe_cnt, + int vlevel); + +void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params); + +#endif /* __DCN31_FPU_H__*/ --=20 2.34.1 From nobody Tue Jun 23 10:12:34 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4969C433F5 for ; Mon, 7 Mar 2022 15:48:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243938AbiCGPtn (ORCPT ); Mon, 7 Mar 2022 10:49:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243921AbiCGPtZ (ORCPT ); Mon, 7 Mar 2022 10:49:25 -0500 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6BFE6652CC for ; Mon, 7 Mar 2022 07:48:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=SUYyjzJa+Ccqw8POIwaRZM01873mllmmbE+fQkBB8Mk=; b=eMIVIH1+qSZtUy81y54xdnprS1 /EdxKG82Th/BKZ2WZNVTRFzDlCmGKvPMqwTTUL4AGVx6YrqVndX6A2plLio91WFMgHbWTy+hoE4RW H6qjTYR9rlZdx1eDd5hQlG/1RHOQ00iHZmkntL6ZAKRJyMBWzsPBTlUcu8y/ZBxhGuZi84N4SyNOt 9ulRXrp/LCh/jPllNPsRpTCgf7lQmR/lHBOlibCLglKemeFWuV2yofCzD9Eq9jrU0kovzD4phRtPh 
Jt8fa8fsfy0BnOvfFb8iCeHHGKvz5R4IXM/MZIIC1Tl0DTECjvhlpBm62XhecoERqfBEd5A8RUa10 wpJMEIWA==; Received: from [192.168.12.191] (helo=killbill.home) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1nRFaw-0002rG-OK; Mon, 07 Mar 2022 16:48:22 +0100 From: Melissa Wen To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, harry.wentland@amd.com, sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com, alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch Cc: Dmytro Laktyushkin , Jasdeep Dhillon , Qingqing Zhuo , Melissa Wen , linux-kernel@vger.kernel.org Subject: [PATCH 2/3] drm/amd/display: move FPU related code from dcn315 to dml/dcn31 folder Date: Mon, 7 Mar 2022 14:48:00 -0100 Message-Id: <20220307154801.2196284-3-mwen@igalia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220307154801.2196284-1-mwen@igalia.com> References: <20220307154801.2196284-1-mwen@igalia.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Moves related structs and dcn315_update_bw_bounding_box from dcn315 driver code to dml/dcn31_fpu that centralizes FPU code for DCN 3.1x. 
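The FPU rule this move relies on (the caller opens the FPU section, the moved function only asserts that it is open) can be sketched as below. This is a minimal user-space model under stated assumptions, not the driver code: fp_depth, FP_START(), FP_END(), fp_enabled(), struct soc_box, update_bw_bounding_box_fpu() and update_bw_bounding_box() are hypothetical stand-ins for DC_FP_START/END, dc_assert_fp_enabled() and the moved functions. The real macros also save and restore FPU state, which is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's FPU-section bookkeeping. */
static int fp_depth;

/* Hypothetical stand-ins for DC_FP_START()/DC_FP_END(). */
#define FP_START() do { fp_depth++; } while (0)
#define FP_END()   do { fp_depth--; } while (0)

/* Hypothetical miniature of a bounding box with one floating-point field. */
struct soc_box {
	double vco_speed_mhz;
};

/* Reports whether some caller has opened an FPU section. */
static bool fp_enabled(void)
{
	return fp_depth > 0;
}

/*
 * Models a public function in the dml FPU file: it does floating-point
 * math, never opens the FPU section itself, and checks that the caller did.
 */
static void update_bw_bounding_box_fpu(struct soc_box *soc, int vco_freq_khz)
{
	assert(fp_enabled());
	soc->vco_speed_mhz = vco_freq_khz / 1000.0;
}

/*
 * Models the caller outside the dml folder: it does no floating-point
 * arithmetic of its own and only brackets the call with the FPU section.
 */
static void update_bw_bounding_box(struct soc_box *soc, int vco_freq_khz)
{
	FP_START();
	update_bw_bounding_box_fpu(soc, vco_freq_khz);
	FP_END();
}
```

In this model, calling update_bw_bounding_box_fpu() directly, without the FP_START()/FP_END() bracket, trips the assertion, which is the kind of misuse dc_assert_fp_enabled() is meant to flag.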
Signed-off-by: Melissa Wen --- .../gpu/drm/amd/display/dc/dcn315/Makefile | 26 -- .../amd/display/dc/dcn315/dcn315_resource.c | 232 +----------------- .../amd/display/dc/dcn315/dcn315_resource.h | 3 + .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 228 +++++++++++++++++ .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h | 3 + 5 files changed, 235 insertions(+), 257 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile b/drivers/gpu/d= rm/amd/display/dc/dcn315/Makefile index c831ad46e81c..59381d24800b 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn315/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dcn315/Makefile @@ -25,32 +25,6 @@ =20 DCN315 =3D dcn315_resource.o =20 -ifdef CONFIG_X86 -CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o :=3D -msse -endif - -ifdef CONFIG_PPC64 -CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o :=3D -mhard-float -maltiv= ec -endif - -ifdef CONFIG_CC_IS_GCC -ifeq ($(call cc-ifversion, -lt, 0701, y), y) -IS_OLD_GCC =3D 1 -endif -CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o +=3D -mhard-float -endif - -ifdef CONFIG_X86 -ifdef IS_OLD_GCC -# Stack alignment mismatch, proceed with caution. -# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-bound= ary=3D3 -# (8B stack alignment). 
-CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o +=3D -mpreferred-stack-bo= undary=3D4 -else -CFLAGS_$(AMDDALPATH)/dc/dcn315/dcn315_resource.o +=3D -msse2 -endif -endif - AMD_DAL_DCN315 =3D $(addprefix $(AMDDALPATH)/dc/dcn315/,$(DCN315)) =20 AMD_DISPLAY_FILES +=3D $(AMD_DAL_DCN315) diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c b/driv= ers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c index 756fec81b9ad..51a712958dbd 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c @@ -66,6 +66,7 @@ #include "virtual/virtual_stream_encoder.h" #include "dce110/dce110_resource.h" #include "dml/display_mode_vba.h" +#include "dml/dcn31/dcn31_fpu.h" #include "dcn31/dcn31_dccg.h" #include "dcn10/dcn10_resource.h" #include "dcn31/dcn31_panel_cntl.h" @@ -133,158 +134,9 @@ =20 #include "link_enc_cfg.h" =20 -#define DC_LOGGER_INIT(logger) - -#define DCN3_15_DEFAULT_DET_SIZE 192 #define DCN3_15_MAX_DET_SIZE 384 -#define DCN3_15_MIN_COMPBUF_SIZE_KB 128 #define DCN3_15_CRB_SEGMENT_SIZE_KB 64 =20 -struct _vcs_dpi_ip_params_st dcn3_15_ip =3D { - .gpuvm_enable =3D 1, - .gpuvm_max_page_table_levels =3D 1, - .hostvm_enable =3D 1, - .hostvm_max_page_table_levels =3D 2, - .rob_buffer_size_kbytes =3D 64, - .det_buffer_size_kbytes =3D DCN3_15_DEFAULT_DET_SIZE, - .min_comp_buffer_size_kbytes =3D DCN3_15_MIN_COMPBUF_SIZE_KB, - .config_return_buffer_size_in_kbytes =3D 1024, - .compressed_buffer_segment_size_in_kbytes =3D 64, - .meta_fifo_size_in_kentries =3D 32, - .zero_size_buffer_entries =3D 512, - .compbuf_reserved_space_64b =3D 256, - .compbuf_reserved_space_zs =3D 64, - .dpp_output_buffer_pixels =3D 2560, - .opp_output_buffer_lines =3D 1, - .pixel_chunk_size_kbytes =3D 8, - .meta_chunk_size_kbytes =3D 2, - .min_meta_chunk_size_bytes =3D 256, - .writeback_chunk_size_kbytes =3D 8, - .ptoi_supported =3D false, - .num_dsc =3D 3, - .maximum_dsc_bits_per_component =3D 10, - .dsc422_native_support =3D 
false, - .is_line_buffer_bpp_fixed =3D true, - .line_buffer_fixed_bpp =3D 49, - .line_buffer_size_bits =3D 789504, - .max_line_buffer_lines =3D 12, - .writeback_interface_buffer_size_kbytes =3D 90, - .max_num_dpp =3D 4, - .max_num_otg =3D 4, - .max_num_hdmi_frl_outputs =3D 1, - .max_num_wb =3D 1, - .max_dchub_pscl_bw_pix_per_clk =3D 4, - .max_pscl_lb_bw_pix_per_clk =3D 2, - .max_lb_vscl_bw_pix_per_clk =3D 4, - .max_vscl_hscl_bw_pix_per_clk =3D 4, - .max_hscl_ratio =3D 6, - .max_vscl_ratio =3D 6, - .max_hscl_taps =3D 8, - .max_vscl_taps =3D 8, - .dpte_buffer_size_in_pte_reqs_luma =3D 64, - .dpte_buffer_size_in_pte_reqs_chroma =3D 34, - .dispclk_ramp_margin_percent =3D 1, - .max_inter_dcn_tile_repeaters =3D 9, - .cursor_buffer_size =3D 16, - .cursor_chunk_size =3D 2, - .writeback_line_buffer_buffer_size =3D 0, - .writeback_min_hscl_ratio =3D 1, - .writeback_min_vscl_ratio =3D 1, - .writeback_max_hscl_ratio =3D 1, - .writeback_max_vscl_ratio =3D 1, - .writeback_max_hscl_taps =3D 1, - .writeback_max_vscl_taps =3D 1, - .dppclk_delay_subtotal =3D 46, - .dppclk_delay_scl =3D 50, - .dppclk_delay_scl_lb_only =3D 16, - .dppclk_delay_cnvc_formatter =3D 27, - .dppclk_delay_cnvc_cursor =3D 6, - .dispclk_delay_subtotal =3D 119, - .dynamic_metadata_vm_enabled =3D false, - .odm_combine_4to1_supported =3D false, - .dcc_supported =3D true, -}; - -struct _vcs_dpi_soc_bounding_box_st dcn3_15_soc =3D { - /*TODO: correct dispclk/dppclk voltage level determination*/ - .clock_limits =3D { - { - .state =3D 0, - .dispclk_mhz =3D 1372.0, - .dppclk_mhz =3D 1372.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 417.0, - .dtbclk_mhz =3D 600.0, - }, - { - .state =3D 1, - .dispclk_mhz =3D 1372.0, - .dppclk_mhz =3D 1372.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 417.0, - .dtbclk_mhz =3D 600.0, - }, - { - .state =3D 2, - .dispclk_mhz =3D 1372.0, - .dppclk_mhz =3D 1372.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 
417.0, - .dtbclk_mhz =3D 600.0, - }, - { - .state =3D 3, - .dispclk_mhz =3D 1372.0, - .dppclk_mhz =3D 1372.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 417.0, - .dtbclk_mhz =3D 600.0, - }, - { - .state =3D 4, - .dispclk_mhz =3D 1372.0, - .dppclk_mhz =3D 1372.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 417.0, - .dtbclk_mhz =3D 600.0, - }, - }, - .num_states =3D 5, - .sr_exit_time_us =3D 9.0, - .sr_enter_plus_exit_time_us =3D 11.0, - .sr_exit_z8_time_us =3D 50.0, - .sr_enter_plus_exit_z8_time_us =3D 50.0, - .writeback_latency_us =3D 12.0, - .dram_channel_width_bytes =3D 4, - .round_trip_ping_latency_dcfclk_cycles =3D 106, - .urgent_latency_pixel_data_only_us =3D 4.0, - .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, - .urgent_latency_vm_data_only_us =3D 4.0, - .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, - .pct_ideal_sdp_bw_after_urgent =3D 80.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 65.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 60.0, - .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 30.0, - .max_avg_sdp_bw_use_normal_percent =3D 60.0, - .max_avg_dram_bw_use_normal_percent =3D 60.0, - .fabric_datapath_to_dcn_data_return_bytes =3D 32, - .return_bus_width_bytes =3D 64, - .downspread_percent =3D 0.38, - .dcn_downspread_percent =3D 0.38, - .gpuvm_min_page_size_bytes =3D 4096, - .hostvm_min_page_size_bytes =3D 4096, - .do_urgent_latency_adjustment =3D false, - .urgent_latency_adjustment_fabric_clock_component_us =3D 0, - .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, -}; - enum dcn31_clk_src_array_id { DCN31_CLK_SRC_PLL0, DCN31_CLK_SRC_PLL1, @@ -1859,88 +1711,6 @@ static struct dc_cap_funcs cap_funcs =3D { .get_dcc_compression_cap =3D dcn20_get_dcc_compression_cap }; =20 -static void 
dcn315_update_bw_bounding_box(struct dc *dc, struct clk_bw_par= ams *bw_params) -{ - struct clk_limit_table *clk_table =3D &bw_params->clk_table; - struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; - unsigned int i, closest_clk_lvl; - int max_dispclk_mhz =3D 0, max_dppclk_mhz =3D 0; - int j; - - // Default clock levels are used for diags, which may lead to overclockin= g. - if (!IS_DIAG_DC(dc->ctx->dce_environment)) { - - dcn3_15_ip.max_num_otg =3D dc->res_pool->res_cap->num_timing_generator; - dcn3_15_ip.max_num_dpp =3D dc->res_pool->pipe_count; - dcn3_15_soc.num_chans =3D bw_params->num_channels; - - ASSERT(clk_table->num_entries); - - /* Prepass to find max clocks independent of voltage level. */ - for (i =3D 0; i < clk_table->num_entries; ++i) { - if (clk_table->entries[i].dispclk_mhz > max_dispclk_mhz) - max_dispclk_mhz =3D clk_table->entries[i].dispclk_mhz; - if (clk_table->entries[i].dppclk_mhz > max_dppclk_mhz) - max_dppclk_mhz =3D clk_table->entries[i].dppclk_mhz; - } - - for (i =3D 0; i < clk_table->num_entries; i++) { - /* loop backwards*/ - for (closest_clk_lvl =3D 0, j =3D dcn3_15_soc.num_states - 1; j >=3D 0;= j--) { - if ((unsigned int) dcn3_15_soc.clock_limits[j].dcfclk_mhz <=3D clk_tab= le->entries[i].dcfclk_mhz) { - closest_clk_lvl =3D j; - break; - } - } - if (clk_table->num_entries =3D=3D 1) { - /*smu gives one DPM level, let's take the highest one*/ - closest_clk_lvl =3D dcn3_15_soc.num_states - 1; - } - - clock_limits[i].state =3D i; - - /* Clocks dependent on voltage level. 
*/ - clock_limits[i].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; - if (clk_table->num_entries =3D=3D 1 && - clock_limits[i].dcfclk_mhz < dcn3_15_soc.clock_limits[closest_clk_lvl]= .dcfclk_mhz) { - /*SMU fix not released yet*/ - clock_limits[i].dcfclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lv= l].dcfclk_mhz; - } - clock_limits[i].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; - clock_limits[i].socclk_mhz =3D clk_table->entries[i].socclk_mhz; - clock_limits[i].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2= * clk_table->entries[i].wck_ratio; - - /* Clocks independent of voltage level. */ - clock_limits[i].dispclk_mhz =3D max_dispclk_mhz ? max_dispclk_mhz : - dcn3_15_soc.clock_limits[closest_clk_lvl].dispclk_mhz; - - clock_limits[i].dppclk_mhz =3D max_dppclk_mhz ? max_dppclk_mhz : - dcn3_15_soc.clock_limits[closest_clk_lvl].dppclk_mhz; - - clock_limits[i].dram_bw_per_chan_gbps =3D dcn3_15_soc.clock_limits[clos= est_clk_lvl].dram_bw_per_chan_gbps; - clock_limits[i].dscclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lvl= ].dscclk_mhz; - clock_limits[i].dtbclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lvl= ].dtbclk_mhz; - clock_limits[i].phyclk_d18_mhz =3D dcn3_15_soc.clock_limits[closest_clk= _lvl].phyclk_d18_mhz; - clock_limits[i].phyclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lvl= ].phyclk_mhz; - } - for (i =3D 0; i < clk_table->num_entries; i++) - dcn3_15_soc.clock_limits[i] =3D clock_limits[i]; - if (clk_table->num_entries) { - dcn3_15_soc.num_states =3D clk_table->num_entries; - } - } - - if (max_dispclk_mhz) { - dcn3_15_soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; - dc->dml.soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; - } - - if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) - dml_init_instance(&dc->dml, &dcn3_15_soc, &dcn3_15_ip, DML_PROJECT_DCN31= ); - else - dml_init_instance(&dc->dml, &dcn3_15_soc, &dcn3_15_ip, DML_PROJECT_DCN31= _FPGA); -} - static struct resource_funcs dcn315_res_pool_funcs =3D { 
.destroy =3D dcn315_destroy_resource_pool, .link_enc_create =3D dcn31_link_encoder_create, diff --git a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.h b/driv= ers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.h index f3a36820a31f..39929fa67a51 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.h @@ -31,6 +31,9 @@ #define TO_DCN315_RES_POOL(pool)\ container_of(pool, struct dcn315_resource_pool, base) =20 +extern struct _vcs_dpi_ip_params_st dcn3_15_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn3_15_soc; + struct dcn315_resource_pool { struct resource_pool base; }; diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers= /gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c index 7ff8fe9e8712..f70b47ef850c 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c @@ -194,6 +194,150 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_1_soc =3D { .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, }; =20 +struct _vcs_dpi_ip_params_st dcn3_15_ip =3D { + .gpuvm_enable =3D 1, + .gpuvm_max_page_table_levels =3D 1, + .hostvm_enable =3D 1, + .hostvm_max_page_table_levels =3D 2, + .rob_buffer_size_kbytes =3D 64, + .det_buffer_size_kbytes =3D DCN3_15_DEFAULT_DET_SIZE, + .min_comp_buffer_size_kbytes =3D DCN3_15_MIN_COMPBUF_SIZE_KB, + .config_return_buffer_size_in_kbytes =3D 1024, + .compressed_buffer_segment_size_in_kbytes =3D 64, + .meta_fifo_size_in_kentries =3D 32, + .zero_size_buffer_entries =3D 512, + .compbuf_reserved_space_64b =3D 256, + .compbuf_reserved_space_zs =3D 64, + .dpp_output_buffer_pixels =3D 2560, + .opp_output_buffer_lines =3D 1, + .pixel_chunk_size_kbytes =3D 8, + .meta_chunk_size_kbytes =3D 2, + .min_meta_chunk_size_bytes =3D 256, + .writeback_chunk_size_kbytes =3D 8, + .ptoi_supported =3D false, + .num_dsc =3D 3, + .maximum_dsc_bits_per_component =3D 10, + .dsc422_native_support =3D false, +
.is_line_buffer_bpp_fixed =3D true, + .line_buffer_fixed_bpp =3D 49, + .line_buffer_size_bits =3D 789504, + .max_line_buffer_lines =3D 12, + .writeback_interface_buffer_size_kbytes =3D 90, + .max_num_dpp =3D 4, + .max_num_otg =3D 4, + .max_num_hdmi_frl_outputs =3D 1, + .max_num_wb =3D 1, + .max_dchub_pscl_bw_pix_per_clk =3D 4, + .max_pscl_lb_bw_pix_per_clk =3D 2, + .max_lb_vscl_bw_pix_per_clk =3D 4, + .max_vscl_hscl_bw_pix_per_clk =3D 4, + .max_hscl_ratio =3D 6, + .max_vscl_ratio =3D 6, + .max_hscl_taps =3D 8, + .max_vscl_taps =3D 8, + .dpte_buffer_size_in_pte_reqs_luma =3D 64, + .dpte_buffer_size_in_pte_reqs_chroma =3D 34, + .dispclk_ramp_margin_percent =3D 1, + .max_inter_dcn_tile_repeaters =3D 9, + .cursor_buffer_size =3D 16, + .cursor_chunk_size =3D 2, + .writeback_line_buffer_buffer_size =3D 0, + .writeback_min_hscl_ratio =3D 1, + .writeback_min_vscl_ratio =3D 1, + .writeback_max_hscl_ratio =3D 1, + .writeback_max_vscl_ratio =3D 1, + .writeback_max_hscl_taps =3D 1, + .writeback_max_vscl_taps =3D 1, + .dppclk_delay_subtotal =3D 46, + .dppclk_delay_scl =3D 50, + .dppclk_delay_scl_lb_only =3D 16, + .dppclk_delay_cnvc_formatter =3D 27, + .dppclk_delay_cnvc_cursor =3D 6, + .dispclk_delay_subtotal =3D 119, + .dynamic_metadata_vm_enabled =3D false, + .odm_combine_4to1_supported =3D false, + .dcc_supported =3D true, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn3_15_soc =3D { + /*TODO: correct dispclk/dppclk voltage level determination*/ + .clock_limits =3D { + { + .state =3D 0, + .dispclk_mhz =3D 1372.0, + .dppclk_mhz =3D 1372.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + .dtbclk_mhz =3D 600.0, + }, + { + .state =3D 1, + .dispclk_mhz =3D 1372.0, + .dppclk_mhz =3D 1372.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + .dtbclk_mhz =3D 600.0, + }, + { + .state =3D 2, + .dispclk_mhz =3D 1372.0, + .dppclk_mhz =3D 1372.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + 
.dtbclk_mhz =3D 600.0, + }, + { + .state =3D 3, + .dispclk_mhz =3D 1372.0, + .dppclk_mhz =3D 1372.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + .dtbclk_mhz =3D 600.0, + }, + { + .state =3D 4, + .dispclk_mhz =3D 1372.0, + .dppclk_mhz =3D 1372.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + .dtbclk_mhz =3D 600.0, + }, + }, + .num_states =3D 5, + .sr_exit_time_us =3D 9.0, + .sr_enter_plus_exit_time_us =3D 11.0, + .sr_exit_z8_time_us =3D 50.0, + .sr_enter_plus_exit_z8_time_us =3D 50.0, + .writeback_latency_us =3D 12.0, + .dram_channel_width_bytes =3D 4, + .round_trip_ping_latency_dcfclk_cycles =3D 106, + .urgent_latency_pixel_data_only_us =3D 4.0, + .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, + .urgent_latency_vm_data_only_us =3D 4.0, + .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, + .pct_ideal_sdp_bw_after_urgent =3D 80.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 65.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 60.0, + .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 30.0, + .max_avg_sdp_bw_use_normal_percent =3D 60.0, + .max_avg_dram_bw_use_normal_percent =3D 60.0, + .fabric_datapath_to_dcn_data_return_bytes =3D 32, + .return_bus_width_bytes =3D 64, + .downspread_percent =3D 0.38, + .dcn_downspread_percent =3D 0.38, + .gpuvm_min_page_size_bytes =3D 4096, + .hostvm_min_page_size_bytes =3D 4096, + .do_urgent_latency_adjustment =3D false, + .urgent_latency_adjustment_fabric_clock_component_us =3D 0, + .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, +}; =20 void dcn31_calculate_wm_and_dlg_fp( struct dc *dc, struct dc_state *context, @@ -404,3 +548,87 @@ void dcn31_update_bw_bounding_box(struct dc *dc, struc= t clk_bw_params *bw_params else dml_init_instance(&dc->dml, &dcn3_1_soc, &dcn3_1_ip, 
DML_PROJECT_DCN31_F= PGA); } + +void dcn315_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw= _params) +{ + struct clk_limit_table *clk_table =3D &bw_params->clk_table; + struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; + unsigned int i, closest_clk_lvl; + int max_dispclk_mhz =3D 0, max_dppclk_mhz =3D 0; + int j; + + dc_assert_fp_enabled(); + + // Default clock levels are used for diags, which may lead to overclockin= g. + if (!IS_DIAG_DC(dc->ctx->dce_environment)) { + + dcn3_15_ip.max_num_otg =3D dc->res_pool->res_cap->num_timing_generator; + dcn3_15_ip.max_num_dpp =3D dc->res_pool->pipe_count; + dcn3_15_soc.num_chans =3D bw_params->num_channels; + + ASSERT(clk_table->num_entries); + + /* Prepass to find max clocks independent of voltage level. */ + for (i =3D 0; i < clk_table->num_entries; ++i) { + if (clk_table->entries[i].dispclk_mhz > max_dispclk_mhz) + max_dispclk_mhz =3D clk_table->entries[i].dispclk_mhz; + if (clk_table->entries[i].dppclk_mhz > max_dppclk_mhz) + max_dppclk_mhz =3D clk_table->entries[i].dppclk_mhz; + } + + for (i =3D 0; i < clk_table->num_entries; i++) { + /* loop backwards*/ + for (closest_clk_lvl =3D 0, j =3D dcn3_15_soc.num_states - 1; j >=3D 0;= j--) { + if ((unsigned int) dcn3_15_soc.clock_limits[j].dcfclk_mhz <=3D clk_tab= le->entries[i].dcfclk_mhz) { + closest_clk_lvl =3D j; + break; + } + } + if (clk_table->num_entries =3D=3D 1) { + /*smu gives one DPM level, let's take the highest one*/ + closest_clk_lvl =3D dcn3_15_soc.num_states - 1; + } + + clock_limits[i].state =3D i; + + /* Clocks dependent on voltage level. 
*/ + clock_limits[i].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; + if (clk_table->num_entries =3D=3D 1 && + clock_limits[i].dcfclk_mhz < dcn3_15_soc.clock_limits[closest_clk_lvl]= .dcfclk_mhz) { + /*SMU fix not released yet*/ + clock_limits[i].dcfclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lv= l].dcfclk_mhz; + } + clock_limits[i].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; + clock_limits[i].socclk_mhz =3D clk_table->entries[i].socclk_mhz; + clock_limits[i].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2= * clk_table->entries[i].wck_ratio; + + /* Clocks independent of voltage level. */ + clock_limits[i].dispclk_mhz =3D max_dispclk_mhz ? max_dispclk_mhz : + dcn3_15_soc.clock_limits[closest_clk_lvl].dispclk_mhz; + + clock_limits[i].dppclk_mhz =3D max_dppclk_mhz ? max_dppclk_mhz : + dcn3_15_soc.clock_limits[closest_clk_lvl].dppclk_mhz; + + clock_limits[i].dram_bw_per_chan_gbps =3D dcn3_15_soc.clock_limits[clos= est_clk_lvl].dram_bw_per_chan_gbps; + clock_limits[i].dscclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lvl= ].dscclk_mhz; + clock_limits[i].dtbclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lvl= ].dtbclk_mhz; + clock_limits[i].phyclk_d18_mhz =3D dcn3_15_soc.clock_limits[closest_clk= _lvl].phyclk_d18_mhz; + clock_limits[i].phyclk_mhz =3D dcn3_15_soc.clock_limits[closest_clk_lvl= ].phyclk_mhz; + } + for (i =3D 0; i < clk_table->num_entries; i++) + dcn3_15_soc.clock_limits[i] =3D clock_limits[i]; + if (clk_table->num_entries) { + dcn3_15_soc.num_states =3D clk_table->num_entries; + } + } + + if (max_dispclk_mhz) { + dcn3_15_soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; + dc->dml.soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; + } + + if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) + dml_init_instance(&dc->dml, &dcn3_15_soc, &dcn3_15_ip, DML_PROJECT_DCN31= ); + else + dml_init_instance(&dc->dml, &dcn3_15_soc, &dcn3_15_ip, DML_PROJECT_DCN31= _FPGA); +} diff --git 
a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers= /gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h index baadb5150e7d..b15b587cf8c4 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h @@ -27,6 +27,8 @@ #define __DCN31_FPU_H__ =20 #define DCN3_1_DEFAULT_DET_SIZE 384 +#define DCN3_15_DEFAULT_DET_SIZE 192 +#define DCN3_15_MIN_COMPBUF_SIZE_KB 128 =20 void dcn31_calculate_wm_and_dlg_fp( struct dc *dc, struct dc_state *context, @@ -35,5 +37,6 @@ void dcn31_calculate_wm_and_dlg_fp( int vlevel); =20 void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params); +void dcn315_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw= _params); =20 #endif /* __DCN31_FPU_H__*/ --=20 2.34.1 From nobody Tue Jun 23 10:12:34 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3E1F6C433EF for ; Mon, 7 Mar 2022 15:48:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243987AbiCGPtt (ORCPT ); Mon, 7 Mar 2022 10:49:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243968AbiCGPt2 (ORCPT ); Mon, 7 Mar 2022 10:49:28 -0500 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3FD683FD8C for ; Mon, 7 Mar 2022 07:48:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc 
:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=54sE5+xnxjCgp99D6lv6XxkhpURJnaNZcUY4E3mIDBc=; b=Q9cFvuoAs3P0kvMh60qthVJXVi YGMHLUlIV4nWwCerx2RxKjB4zbXtPaiwb+ZPxWx5C0NTnLAtM7iBsGtNT2hhtBCdkyXnIMT1EtdwE rjhrPrQo8XHbX1wWREv2VxbpUoDOyMCz7Wbl5tz2TwOv268Tnp3CYYxKeN5nRBP1y7TpoyCKOzFXo D6DZpPziDZ3ZZ3PKI/xxCk+co8O6gPTpV2h8py3tyWak33uqO5N861bICE6OvMj3tL5eH3q5X4IrX QxxPsNRzTWye5a4t6CD+0XEkwz4cOlrMVRHySZQ1fS4qDxlIZfhlVxALIcxKFMQn8iCNqy2lADaZQ r9PAn7Ig==; Received: from [192.168.12.191] (helo=killbill.home) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1nRFaz-0002rG-I1; Mon, 07 Mar 2022 16:48:25 +0100 From: Melissa Wen To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, harry.wentland@amd.com, sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com, alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch Cc: Dmytro Laktyushkin , Jasdeep Dhillon , Qingqing Zhuo , Melissa Wen , linux-kernel@vger.kernel.org Subject: [PATCH 3/3] drm/amd/display: move FPU related code from dcn316 to dml/dcn31 folder Date: Mon, 7 Mar 2022 14:48:01 -0100 Message-Id: <20220307154801.2196284-4-mwen@igalia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220307154801.2196284-1-mwen@igalia.com> References: <20220307154801.2196284-1-mwen@igalia.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Moves the FPU-related structs and dcn316_update_bw_bounding_box() from the dcn316 driver to dml/dcn31, which centralizes FPU operations for DCN 3.1x. Signed-off-by: Melissa Wen --- .../gpu/drm/amd/display/dc/dcn316/Makefile | 26 -- .../amd/display/dc/dcn316/dcn316_resource.c | 231 +----------------- .../amd/display/dc/dcn316/dcn316_resource.h | 3 + .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 229 +++++++++++++++++ 
.../drm/amd/display/dc/dml/dcn31/dcn31_fpu.h | 2 + 5 files changed, 235 insertions(+), 256 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile b/drivers/gpu/d= rm/amd/display/dc/dcn316/Makefile index cd87b687c5e2..819d44a9439b 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn316/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dcn316/Makefile @@ -25,32 +25,6 @@ =20 DCN316 =3D dcn316_resource.o =20 -ifdef CONFIG_X86 -CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o :=3D -msse -endif - -ifdef CONFIG_PPC64 -CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o :=3D -mhard-float -maltiv= ec -endif - -ifdef CONFIG_CC_IS_GCC -ifeq ($(call cc-ifversion, -lt, 0701, y), y) -IS_OLD_GCC =3D 1 -endif -CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o +=3D -mhard-float -endif - -ifdef CONFIG_X86 -ifdef IS_OLD_GCC -# Stack alignment mismatch, proceed with caution. -# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-bound= ary=3D3 -# (8B stack alignment). -CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o +=3D -mpreferred-stack-bo= undary=3D4 -else -CFLAGS_$(AMDDALPATH)/dc/dcn316/dcn316_resource.o +=3D -msse2 -endif -endif - AMD_DAL_DCN316 =3D $(addprefix $(AMDDALPATH)/dc/dcn316/,$(DCN316)) =20 AMD_DISPLAY_FILES +=3D $(AMD_DAL_DCN316) diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c b/driv= ers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c index 90c17c44dd7c..1e451d069bc3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c @@ -66,6 +66,7 @@ #include "virtual/virtual_stream_encoder.h" #include "dce110/dce110_resource.h" #include "dml/display_mode_vba.h" +#include "dml/dcn31/dcn31_fpu.h" #include "dcn31/dcn31_dccg.h" #include "dcn10/dcn10_resource.h" #include "dcn31/dcn31_panel_cntl.h" @@ -123,157 +124,10 @@ =20 #include "link_enc_cfg.h" =20 -#define DC_LOGGER_INIT(logger) - -#define DCN3_16_DEFAULT_DET_SIZE 192 #define DCN3_16_MAX_DET_SIZE 384 #define 
DCN3_16_MIN_COMPBUF_SIZE_KB 128 #define DCN3_16_CRB_SEGMENT_SIZE_KB 64 =20 -struct _vcs_dpi_ip_params_st dcn3_16_ip =3D { - .gpuvm_enable =3D 1, - .gpuvm_max_page_table_levels =3D 1, - .hostvm_enable =3D 1, - .hostvm_max_page_table_levels =3D 2, - .rob_buffer_size_kbytes =3D 64, - .det_buffer_size_kbytes =3D DCN3_16_DEFAULT_DET_SIZE, - .config_return_buffer_size_in_kbytes =3D 1024, - .compressed_buffer_segment_size_in_kbytes =3D 64, - .meta_fifo_size_in_kentries =3D 32, - .zero_size_buffer_entries =3D 512, - .compbuf_reserved_space_64b =3D 256, - .compbuf_reserved_space_zs =3D 64, - .dpp_output_buffer_pixels =3D 2560, - .opp_output_buffer_lines =3D 1, - .pixel_chunk_size_kbytes =3D 8, - .meta_chunk_size_kbytes =3D 2, - .min_meta_chunk_size_bytes =3D 256, - .writeback_chunk_size_kbytes =3D 8, - .ptoi_supported =3D false, - .num_dsc =3D 3, - .maximum_dsc_bits_per_component =3D 10, - .dsc422_native_support =3D false, - .is_line_buffer_bpp_fixed =3D true, - .line_buffer_fixed_bpp =3D 48, - .line_buffer_size_bits =3D 789504, - .max_line_buffer_lines =3D 12, - .writeback_interface_buffer_size_kbytes =3D 90, - .max_num_dpp =3D 4, - .max_num_otg =3D 4, - .max_num_hdmi_frl_outputs =3D 1, - .max_num_wb =3D 1, - .max_dchub_pscl_bw_pix_per_clk =3D 4, - .max_pscl_lb_bw_pix_per_clk =3D 2, - .max_lb_vscl_bw_pix_per_clk =3D 4, - .max_vscl_hscl_bw_pix_per_clk =3D 4, - .max_hscl_ratio =3D 6, - .max_vscl_ratio =3D 6, - .max_hscl_taps =3D 8, - .max_vscl_taps =3D 8, - .dpte_buffer_size_in_pte_reqs_luma =3D 64, - .dpte_buffer_size_in_pte_reqs_chroma =3D 34, - .dispclk_ramp_margin_percent =3D 1, - .max_inter_dcn_tile_repeaters =3D 8, - .cursor_buffer_size =3D 16, - .cursor_chunk_size =3D 2, - .writeback_line_buffer_buffer_size =3D 0, - .writeback_min_hscl_ratio =3D 1, - .writeback_min_vscl_ratio =3D 1, - .writeback_max_hscl_ratio =3D 1, - .writeback_max_vscl_ratio =3D 1, - .writeback_max_hscl_taps =3D 1, - .writeback_max_vscl_taps =3D 1, - .dppclk_delay_subtotal =3D 46, - 
.dppclk_delay_scl =3D 50, - .dppclk_delay_scl_lb_only =3D 16, - .dppclk_delay_cnvc_formatter =3D 27, - .dppclk_delay_cnvc_cursor =3D 6, - .dispclk_delay_subtotal =3D 119, - .dynamic_metadata_vm_enabled =3D false, - .odm_combine_4to1_supported =3D false, - .dcc_supported =3D true, -}; - -struct _vcs_dpi_soc_bounding_box_st dcn3_16_soc =3D { - /*TODO: correct dispclk/dppclk voltage level determination*/ - .clock_limits =3D { - { - .state =3D 0, - .dispclk_mhz =3D 556.0, - .dppclk_mhz =3D 556.0, - .phyclk_mhz =3D 600.0, - .phyclk_d18_mhz =3D 445.0, - .dscclk_mhz =3D 186.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 1, - .dispclk_mhz =3D 625.0, - .dppclk_mhz =3D 625.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 209.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 2, - .dispclk_mhz =3D 625.0, - .dppclk_mhz =3D 625.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 209.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 3, - .dispclk_mhz =3D 1112.0, - .dppclk_mhz =3D 1112.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 371.0, - .dtbclk_mhz =3D 625.0, - }, - { - .state =3D 4, - .dispclk_mhz =3D 1250.0, - .dppclk_mhz =3D 1250.0, - .phyclk_mhz =3D 810.0, - .phyclk_d18_mhz =3D 667.0, - .dscclk_mhz =3D 417.0, - .dtbclk_mhz =3D 625.0, - }, - }, - .num_states =3D 5, - .sr_exit_time_us =3D 9.0, - .sr_enter_plus_exit_time_us =3D 11.0, - .sr_exit_z8_time_us =3D 442.0, - .sr_enter_plus_exit_z8_time_us =3D 560.0, - .writeback_latency_us =3D 12.0, - .dram_channel_width_bytes =3D 4, - .round_trip_ping_latency_dcfclk_cycles =3D 106, - .urgent_latency_pixel_data_only_us =3D 4.0, - .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, - .urgent_latency_vm_data_only_us =3D 4.0, - .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, - 
.pct_ideal_sdp_bw_after_urgent =3D 80.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 65.0, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 60.0, - .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 30.0, - .max_avg_sdp_bw_use_normal_percent =3D 60.0, - .max_avg_dram_bw_use_normal_percent =3D 60.0, - .fabric_datapath_to_dcn_data_return_bytes =3D 32, - .return_bus_width_bytes =3D 64, - .downspread_percent =3D 0.38, - .dcn_downspread_percent =3D 0.5, - .gpuvm_min_page_size_bytes =3D 4096, - .hostvm_min_page_size_bytes =3D 4096, - .do_urgent_latency_adjustment =3D false, - .urgent_latency_adjustment_fabric_clock_component_us =3D 0, - .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, -}; - enum dcn31_clk_src_array_id { DCN31_CLK_SRC_PLL0, DCN31_CLK_SRC_PLL1, @@ -1859,89 +1713,6 @@ static struct dc_cap_funcs cap_funcs =3D { .get_dcc_compression_cap =3D dcn20_get_dcc_compression_cap }; =20 -static void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_par= ams *bw_params) -{ - struct clk_limit_table *clk_table =3D &bw_params->clk_table; - struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; - unsigned int i, closest_clk_lvl; - int max_dispclk_mhz =3D 0, max_dppclk_mhz =3D 0; - int j; - - // Default clock levels are used for diags, which may lead to overclockin= g. - if (!IS_DIAG_DC(dc->ctx->dce_environment)) { - - dcn3_16_ip.max_num_otg =3D dc->res_pool->res_cap->num_timing_generator; - dcn3_16_ip.max_num_dpp =3D dc->res_pool->pipe_count; - dcn3_16_soc.num_chans =3D bw_params->num_channels; - - ASSERT(clk_table->num_entries); - - /* Prepass to find max clocks independent of voltage level. 
*/ - for (i =3D 0; i < clk_table->num_entries; ++i) { - if (clk_table->entries[i].dispclk_mhz > max_dispclk_mhz) - max_dispclk_mhz =3D clk_table->entries[i].dispclk_mhz; - if (clk_table->entries[i].dppclk_mhz > max_dppclk_mhz) - max_dppclk_mhz =3D clk_table->entries[i].dppclk_mhz; - } - - for (i =3D 0; i < clk_table->num_entries; i++) { - /* loop backwards*/ - for (closest_clk_lvl =3D 0, j =3D dcn3_16_soc.num_states - 1; j >=3D 0;= j--) { - if ((unsigned int) dcn3_16_soc.clock_limits[j].dcfclk_mhz <=3D clk_tab= le->entries[i].dcfclk_mhz) { - closest_clk_lvl =3D j; - break; - } - } - // Ported from DCN315 - if (clk_table->num_entries =3D=3D 1) { - /*smu gives one DPM level, let's take the highest one*/ - closest_clk_lvl =3D dcn3_16_soc.num_states - 1; - } - - clock_limits[i].state =3D i; - - /* Clocks dependent on voltage level. */ - clock_limits[i].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; - if (clk_table->num_entries =3D=3D 1 && - clock_limits[i].dcfclk_mhz < dcn3_16_soc.clock_limits[closest_clk_lvl]= .dcfclk_mhz) { - /*SMU fix not released yet*/ - clock_limits[i].dcfclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lv= l].dcfclk_mhz; - } - clock_limits[i].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; - clock_limits[i].socclk_mhz =3D clk_table->entries[i].socclk_mhz; - clock_limits[i].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2= * clk_table->entries[i].wck_ratio; - - /* Clocks independent of voltage level. */ - clock_limits[i].dispclk_mhz =3D max_dispclk_mhz ? max_dispclk_mhz : - dcn3_16_soc.clock_limits[closest_clk_lvl].dispclk_mhz; - - clock_limits[i].dppclk_mhz =3D max_dppclk_mhz ? 
max_dppclk_mhz : - dcn3_16_soc.clock_limits[closest_clk_lvl].dppclk_mhz; - - clock_limits[i].dram_bw_per_chan_gbps =3D dcn3_16_soc.clock_limits[clos= est_clk_lvl].dram_bw_per_chan_gbps; - clock_limits[i].dscclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lvl= ].dscclk_mhz; - clock_limits[i].dtbclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lvl= ].dtbclk_mhz; - clock_limits[i].phyclk_d18_mhz =3D dcn3_16_soc.clock_limits[closest_clk= _lvl].phyclk_d18_mhz; - clock_limits[i].phyclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lvl= ].phyclk_mhz; - } - for (i =3D 0; i < clk_table->num_entries; i++) - dcn3_16_soc.clock_limits[i] =3D clock_limits[i]; - if (clk_table->num_entries) { - dcn3_16_soc.num_states =3D clk_table->num_entries; - } - } - - if (max_dispclk_mhz) { - dcn3_16_soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; - dc->dml.soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; - } - - if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) - dml_init_instance(&dc->dml, &dcn3_16_soc, &dcn3_16_ip, DML_PROJECT_DCN31= ); - else - dml_init_instance(&dc->dml, &dcn3_16_soc, &dcn3_16_ip, DML_PROJECT_DCN31= _FPGA); -} - static struct resource_funcs dcn316_res_pool_funcs =3D { .destroy =3D dcn316_destroy_resource_pool, .link_enc_create =3D dcn31_link_encoder_create, diff --git a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.h b/driv= ers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.h index 9d0d60cb9482..0dc5a6c13ae7 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.h @@ -31,6 +31,9 @@ #define TO_DCN316_RES_POOL(pool)\ container_of(pool, struct dcn316_resource_pool, base) =20 +extern struct _vcs_dpi_ip_params_st dcn3_16_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn3_16_soc; + struct dcn316_resource_pool { struct resource_pool base; }; diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers= /gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c index 
f70b47ef850c..a0a2e125c9c8 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c @@ -339,6 +339,150 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_15_soc =3D { .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, }; =20 +struct _vcs_dpi_ip_params_st dcn3_16_ip =3D { + .gpuvm_enable =3D 1, + .gpuvm_max_page_table_levels =3D 1, + .hostvm_enable =3D 1, + .hostvm_max_page_table_levels =3D 2, + .rob_buffer_size_kbytes =3D 64, + .det_buffer_size_kbytes =3D DCN3_16_DEFAULT_DET_SIZE, + .config_return_buffer_size_in_kbytes =3D 1024, + .compressed_buffer_segment_size_in_kbytes =3D 64, + .meta_fifo_size_in_kentries =3D 32, + .zero_size_buffer_entries =3D 512, + .compbuf_reserved_space_64b =3D 256, + .compbuf_reserved_space_zs =3D 64, + .dpp_output_buffer_pixels =3D 2560, + .opp_output_buffer_lines =3D 1, + .pixel_chunk_size_kbytes =3D 8, + .meta_chunk_size_kbytes =3D 2, + .min_meta_chunk_size_bytes =3D 256, + .writeback_chunk_size_kbytes =3D 8, + .ptoi_supported =3D false, + .num_dsc =3D 3, + .maximum_dsc_bits_per_component =3D 10, + .dsc422_native_support =3D false, + .is_line_buffer_bpp_fixed =3D true, + .line_buffer_fixed_bpp =3D 48, + .line_buffer_size_bits =3D 789504, + .max_line_buffer_lines =3D 12, + .writeback_interface_buffer_size_kbytes =3D 90, + .max_num_dpp =3D 4, + .max_num_otg =3D 4, + .max_num_hdmi_frl_outputs =3D 1, + .max_num_wb =3D 1, + .max_dchub_pscl_bw_pix_per_clk =3D 4, + .max_pscl_lb_bw_pix_per_clk =3D 2, + .max_lb_vscl_bw_pix_per_clk =3D 4, + .max_vscl_hscl_bw_pix_per_clk =3D 4, + .max_hscl_ratio =3D 6, + .max_vscl_ratio =3D 6, + .max_hscl_taps =3D 8, + .max_vscl_taps =3D 8, + .dpte_buffer_size_in_pte_reqs_luma =3D 64, + .dpte_buffer_size_in_pte_reqs_chroma =3D 34, + .dispclk_ramp_margin_percent =3D 1, + .max_inter_dcn_tile_repeaters =3D 8, + .cursor_buffer_size =3D 16, + .cursor_chunk_size =3D 2, + .writeback_line_buffer_buffer_size =3D 0, + .writeback_min_hscl_ratio =3D 1, + 
.writeback_min_vscl_ratio =3D 1, + .writeback_max_hscl_ratio =3D 1, + .writeback_max_vscl_ratio =3D 1, + .writeback_max_hscl_taps =3D 1, + .writeback_max_vscl_taps =3D 1, + .dppclk_delay_subtotal =3D 46, + .dppclk_delay_scl =3D 50, + .dppclk_delay_scl_lb_only =3D 16, + .dppclk_delay_cnvc_formatter =3D 27, + .dppclk_delay_cnvc_cursor =3D 6, + .dispclk_delay_subtotal =3D 119, + .dynamic_metadata_vm_enabled =3D false, + .odm_combine_4to1_supported =3D false, + .dcc_supported =3D true, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn3_16_soc =3D { + /*TODO: correct dispclk/dppclk voltage level determination*/ + .clock_limits =3D { + { + .state =3D 0, + .dispclk_mhz =3D 556.0, + .dppclk_mhz =3D 556.0, + .phyclk_mhz =3D 600.0, + .phyclk_d18_mhz =3D 445.0, + .dscclk_mhz =3D 186.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 1, + .dispclk_mhz =3D 625.0, + .dppclk_mhz =3D 625.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 209.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 2, + .dispclk_mhz =3D 625.0, + .dppclk_mhz =3D 625.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 209.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 3, + .dispclk_mhz =3D 1112.0, + .dppclk_mhz =3D 1112.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 371.0, + .dtbclk_mhz =3D 625.0, + }, + { + .state =3D 4, + .dispclk_mhz =3D 1250.0, + .dppclk_mhz =3D 1250.0, + .phyclk_mhz =3D 810.0, + .phyclk_d18_mhz =3D 667.0, + .dscclk_mhz =3D 417.0, + .dtbclk_mhz =3D 625.0, + }, + }, + .num_states =3D 5, + .sr_exit_time_us =3D 9.0, + .sr_enter_plus_exit_time_us =3D 11.0, + .sr_exit_z8_time_us =3D 442.0, + .sr_enter_plus_exit_z8_time_us =3D 560.0, + .writeback_latency_us =3D 12.0, + .dram_channel_width_bytes =3D 4, + .round_trip_ping_latency_dcfclk_cycles =3D 106, + .urgent_latency_pixel_data_only_us =3D 4.0, + .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, + .urgent_latency_vm_data_only_us =3D 4.0, + 
.urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, + .pct_ideal_sdp_bw_after_urgent =3D 80.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 65.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 60.0, + .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 30.0, + .max_avg_sdp_bw_use_normal_percent =3D 60.0, + .max_avg_dram_bw_use_normal_percent =3D 60.0, + .fabric_datapath_to_dcn_data_return_bytes =3D 32, + .return_bus_width_bytes =3D 64, + .downspread_percent =3D 0.38, + .dcn_downspread_percent =3D 0.5, + .gpuvm_min_page_size_bytes =3D 4096, + .hostvm_min_page_size_bytes =3D 4096, + .do_urgent_latency_adjustment =3D false, + .urgent_latency_adjustment_fabric_clock_component_us =3D 0, + .urgent_latency_adjustment_fabric_clock_reference_mhz =3D 0, +}; + void dcn31_calculate_wm_and_dlg_fp( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, @@ -632,3 +776,88 @@ void dcn315_update_bw_bounding_box(struct dc *dc, stru= ct clk_bw_params *bw_param else dml_init_instance(&dc->dml, &dcn3_15_soc, &dcn3_15_ip, DML_PROJECT_DCN31= _FPGA); } + +void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw= _params) +{ + struct clk_limit_table *clk_table =3D &bw_params->clk_table; + struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; + unsigned int i, closest_clk_lvl; + int max_dispclk_mhz =3D 0, max_dppclk_mhz =3D 0; + int j; + + dc_assert_fp_enabled(); + + // Default clock levels are used for diags, which may lead to overclockin= g. + if (!IS_DIAG_DC(dc->ctx->dce_environment)) { + + dcn3_16_ip.max_num_otg =3D dc->res_pool->res_cap->num_timing_generator; + dcn3_16_ip.max_num_dpp =3D dc->res_pool->pipe_count; + dcn3_16_soc.num_chans =3D bw_params->num_channels; + + ASSERT(clk_table->num_entries); + + /* Prepass to find max clocks independent of voltage level. 
*/ + for (i =3D 0; i < clk_table->num_entries; ++i) { + if (clk_table->entries[i].dispclk_mhz > max_dispclk_mhz) + max_dispclk_mhz =3D clk_table->entries[i].dispclk_mhz; + if (clk_table->entries[i].dppclk_mhz > max_dppclk_mhz) + max_dppclk_mhz =3D clk_table->entries[i].dppclk_mhz; + } + + for (i =3D 0; i < clk_table->num_entries; i++) { + /* loop backwards*/ + for (closest_clk_lvl =3D 0, j =3D dcn3_16_soc.num_states - 1; j >=3D 0;= j--) { + if ((unsigned int) dcn3_16_soc.clock_limits[j].dcfclk_mhz <=3D clk_tab= le->entries[i].dcfclk_mhz) { + closest_clk_lvl =3D j; + break; + } + } + // Ported from DCN315 + if (clk_table->num_entries =3D=3D 1) { + /*smu gives one DPM level, let's take the highest one*/ + closest_clk_lvl =3D dcn3_16_soc.num_states - 1; + } + + clock_limits[i].state =3D i; + + /* Clocks dependent on voltage level. */ + clock_limits[i].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; + if (clk_table->num_entries =3D=3D 1 && + clock_limits[i].dcfclk_mhz < dcn3_16_soc.clock_limits[closest_clk_lvl]= .dcfclk_mhz) { + /*SMU fix not released yet*/ + clock_limits[i].dcfclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lv= l].dcfclk_mhz; + } + clock_limits[i].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; + clock_limits[i].socclk_mhz =3D clk_table->entries[i].socclk_mhz; + clock_limits[i].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2= * clk_table->entries[i].wck_ratio; + + /* Clocks independent of voltage level. */ + clock_limits[i].dispclk_mhz =3D max_dispclk_mhz ? max_dispclk_mhz : + dcn3_16_soc.clock_limits[closest_clk_lvl].dispclk_mhz; + + clock_limits[i].dppclk_mhz =3D max_dppclk_mhz ? 
max_dppclk_mhz : + dcn3_16_soc.clock_limits[closest_clk_lvl].dppclk_mhz; + + clock_limits[i].dram_bw_per_chan_gbps =3D dcn3_16_soc.clock_limits[clos= est_clk_lvl].dram_bw_per_chan_gbps; + clock_limits[i].dscclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lvl= ].dscclk_mhz; + clock_limits[i].dtbclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lvl= ].dtbclk_mhz; + clock_limits[i].phyclk_d18_mhz =3D dcn3_16_soc.clock_limits[closest_clk= _lvl].phyclk_d18_mhz; + clock_limits[i].phyclk_mhz =3D dcn3_16_soc.clock_limits[closest_clk_lvl= ].phyclk_mhz; + } + for (i =3D 0; i < clk_table->num_entries; i++) + dcn3_16_soc.clock_limits[i] =3D clock_limits[i]; + if (clk_table->num_entries) { + dcn3_16_soc.num_states =3D clk_table->num_entries; + } + } + + if (max_dispclk_mhz) { + dcn3_16_soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; + dc->dml.soc.dispclk_dppclk_vco_speed_mhz =3D max_dispclk_mhz * 2; + } + + if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment)) + dml_init_instance(&dc->dml, &dcn3_16_soc, &dcn3_16_ip, DML_PROJECT_DCN31= ); + else + dml_init_instance(&dc->dml, &dcn3_16_soc, &dcn3_16_ip, DML_PROJECT_DCN31= _FPGA); +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h b/drivers= /gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h index b15b587cf8c4..24ac19c83687 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h @@ -29,6 +29,7 @@ #define DCN3_1_DEFAULT_DET_SIZE 384 #define DCN3_15_DEFAULT_DET_SIZE 192 #define DCN3_15_MIN_COMPBUF_SIZE_KB 128 +#define DCN3_16_DEFAULT_DET_SIZE 192 =20 void dcn31_calculate_wm_and_dlg_fp( struct dc *dc, struct dc_state *context, @@ -38,5 +39,6 @@ void dcn31_calculate_wm_and_dlg_fp( =20 void dcn31_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params); void dcn315_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw= _params); +void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw= _params); =20 #endif 
/* __DCN31_FPU_H__*/ --=20 2.34.1