From: Melissa Wen
To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
	harry.wentland@amd.com, sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com,
	alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com,
	airlied@linux.ie, daniel@ffwll.ch
Cc: Dmytro Laktyushkin, Jasdeep Dhillon, Qingqing Zhuo, Melissa Wen,
	linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] drm/amd/display: move FPU operations from dcn21 to dml/dcn20 folder
Date: Mon, 28 Feb 2022 20:10:46 -0100
Message-Id: <20220228211047.3957945-2-mwen@igalia.com>
In-Reply-To: <20220228211047.3957945-1-mwen@igalia.com>
References: <20220228211047.3957945-1-mwen@igalia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

dml/dcn20_fpu file centralizes all DCN2x functions that require FPU access.
Therefore, this patch moves FPU-related code from dcn21 to dcn20_fpu. These
include:

- dcn21_populate_dml_pipes_from_context()
- dcn21_validate_bandwidth_fp() and related:
  - dcn21_calculate_wm(),
  - patch_bounding_box(),
  - calculate_wm_set_for_vlevel()
- renaming update_bw_bounding_box() to dcn21_update_bw_bounding_box(), move to
  dcn20_fpu with related static function construct_low_pstate_lvl()

Also, make dcn21_fast_validate_bw() public in dcn21_resource as it is called
by dcn21_validate_bandwidth_fp() now in dcn20_fpu.
Reuse dcn20_fpu_adjust_dppclk() in dcn21_fast_validate_bw() as it isolates the
same FPU operation.
Include dchubbub.h as it is required in dcn21_populate_dml_pipes_from_context()

Signed-off-by: Melissa Wen
---
 drivers/gpu/drm/amd/display/dc/dcn21/Makefile |  25 -
 .../drm/amd/display/dc/dcn21/dcn21_resource.c | 566 +-----------------
 .../drm/amd/display/dc/dcn21/dcn21_resource.h |  11 +
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  | 538 ++++++++++++++++-
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.h  |   9 +
 5 files changed, 571 insertions(+), 578 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
index bb8c95141082..0dc06e428999 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
@@ -5,31 +5,6 @@
 DCN21 = dcn21_init.o dcn21_hubp.o dcn21_hubbub.o dcn21_resource.o \
 	 dcn21_hwseq.o dcn21_link_encoder.o dcn21_dccg.o
 
-ifdef CONFIG_X86
-CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse
-endif
-
-ifdef CONFIG_PPC64
-CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -maltivec
-endif
-
-ifdef CONFIG_CC_IS_GCC
-ifeq ($(call cc-ifversion, -lt, 0701, y), y)
-IS_OLD_GCC = 1
-endif
-endif
-
-ifdef CONFIG_X86
-ifdef IS_OLD_GCC
-# Stack alignment mismatch, proceed with caution.
-# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
-# (8B stack alignment).
-CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o +=3D -mpreferred-stack-boun= dary=3D4 -else -CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o +=3D -msse2 -endif -endif - AMD_DAL_DCN21 =3D $(addprefix $(AMDDALPATH)/dc/dcn21/,$(DCN21)) =20 AMD_DISPLAY_FILES +=3D $(AMD_DAL_DCN21) diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c b/driver= s/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c index c1cd1a8ff1d7..612732656772 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c @@ -34,6 +34,7 @@ #include "resource.h" #include "include/irq_service_interface.h" #include "dcn20/dcn20_resource.h" +#include "dcn21/dcn21_resource.h" =20 #include "dml/dcn20/dcn20_fpu.h" =20 @@ -89,230 +90,6 @@ #include "dce/dmub_psr.h" #include "dce/dmub_abm.h" =20 -#define DC_LOGGER_INIT(logger) - - -struct _vcs_dpi_ip_params_st dcn2_1_ip =3D { - .odm_capable =3D 1, - .gpuvm_enable =3D 1, - .hostvm_enable =3D 1, - .gpuvm_max_page_table_levels =3D 1, - .hostvm_max_page_table_levels =3D 4, - .hostvm_cached_page_table_levels =3D 2, - .num_dsc =3D 3, - .rob_buffer_size_kbytes =3D 168, - .det_buffer_size_kbytes =3D 164, - .dpte_buffer_size_in_pte_reqs_luma =3D 44, - .dpte_buffer_size_in_pte_reqs_chroma =3D 42,//todo - .dpp_output_buffer_pixels =3D 2560, - .opp_output_buffer_lines =3D 1, - .pixel_chunk_size_kbytes =3D 8, - .pte_enable =3D 1, - .max_page_table_levels =3D 4, - .pte_chunk_size_kbytes =3D 2, - .meta_chunk_size_kbytes =3D 2, - .min_meta_chunk_size_bytes =3D 256, - .writeback_chunk_size_kbytes =3D 2, - .line_buffer_size_bits =3D 789504, - .is_line_buffer_bpp_fixed =3D 0, - .line_buffer_fixed_bpp =3D 0, - .dcc_supported =3D true, - .max_line_buffer_lines =3D 12, - .writeback_luma_buffer_size_kbytes =3D 12, - .writeback_chroma_buffer_size_kbytes =3D 8, - .writeback_chroma_line_buffer_width_pixels =3D 4, - .writeback_max_hscl_ratio =3D 1, - .writeback_max_vscl_ratio =3D 1, - .writeback_min_hscl_ratio =3D 1, - 
.writeback_min_vscl_ratio =3D 1, - .writeback_max_hscl_taps =3D 12, - .writeback_max_vscl_taps =3D 12, - .writeback_line_buffer_luma_buffer_size =3D 0, - .writeback_line_buffer_chroma_buffer_size =3D 14643, - .cursor_buffer_size =3D 8, - .cursor_chunk_size =3D 2, - .max_num_otg =3D 4, - .max_num_dpp =3D 4, - .max_num_wb =3D 1, - .max_dchub_pscl_bw_pix_per_clk =3D 4, - .max_pscl_lb_bw_pix_per_clk =3D 2, - .max_lb_vscl_bw_pix_per_clk =3D 4, - .max_vscl_hscl_bw_pix_per_clk =3D 4, - .max_hscl_ratio =3D 4, - .max_vscl_ratio =3D 4, - .hscl_mults =3D 4, - .vscl_mults =3D 4, - .max_hscl_taps =3D 8, - .max_vscl_taps =3D 8, - .dispclk_ramp_margin_percent =3D 1, - .underscan_factor =3D 1.10, - .min_vblank_lines =3D 32, // - .dppclk_delay_subtotal =3D 77, // - .dppclk_delay_scl_lb_only =3D 16, - .dppclk_delay_scl =3D 50, - .dppclk_delay_cnvc_formatter =3D 8, - .dppclk_delay_cnvc_cursor =3D 6, - .dispclk_delay_subtotal =3D 87, // - .dcfclk_cstate_latency =3D 10, // SRExitTime - .max_inter_dcn_tile_repeaters =3D 8, - - .xfc_supported =3D false, - .xfc_fill_bw_overhead_percent =3D 10.0, - .xfc_fill_constant_bytes =3D 0, - .ptoi_supported =3D 0, - .number_of_cursors =3D 1, -}; - -struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc =3D { - .clock_limits =3D { - { - .state =3D 0, - .dcfclk_mhz =3D 400.0, - .fabricclk_mhz =3D 400.0, - .dispclk_mhz =3D 600.0, - .dppclk_mhz =3D 400.00, - .phyclk_mhz =3D 600.0, - .socclk_mhz =3D 278.0, - .dscclk_mhz =3D 205.67, - .dram_speed_mts =3D 1600.0, - }, - { - .state =3D 1, - .dcfclk_mhz =3D 464.52, - .fabricclk_mhz =3D 800.0, - .dispclk_mhz =3D 654.55, - .dppclk_mhz =3D 626.09, - .phyclk_mhz =3D 600.0, - .socclk_mhz =3D 278.0, - .dscclk_mhz =3D 205.67, - .dram_speed_mts =3D 1600.0, - }, - { - .state =3D 2, - .dcfclk_mhz =3D 514.29, - .fabricclk_mhz =3D 933.0, - .dispclk_mhz =3D 757.89, - .dppclk_mhz =3D 685.71, - .phyclk_mhz =3D 600.0, - .socclk_mhz =3D 278.0, - .dscclk_mhz =3D 287.67, - .dram_speed_mts =3D 1866.0, - }, - { - .state =3D 3, - 
.dcfclk_mhz =3D 576.00, - .fabricclk_mhz =3D 1067.0, - .dispclk_mhz =3D 847.06, - .dppclk_mhz =3D 757.89, - .phyclk_mhz =3D 600.0, - .socclk_mhz =3D 715.0, - .dscclk_mhz =3D 318.334, - .dram_speed_mts =3D 2134.0, - }, - { - .state =3D 4, - .dcfclk_mhz =3D 626.09, - .fabricclk_mhz =3D 1200.0, - .dispclk_mhz =3D 900.00, - .dppclk_mhz =3D 847.06, - .phyclk_mhz =3D 810.0, - .socclk_mhz =3D 953.0, - .dscclk_mhz =3D 489.0, - .dram_speed_mts =3D 2400.0, - }, - { - .state =3D 5, - .dcfclk_mhz =3D 685.71, - .fabricclk_mhz =3D 1333.0, - .dispclk_mhz =3D 1028.57, - .dppclk_mhz =3D 960.00, - .phyclk_mhz =3D 810.0, - .socclk_mhz =3D 278.0, - .dscclk_mhz =3D 287.67, - .dram_speed_mts =3D 2666.0, - }, - { - .state =3D 6, - .dcfclk_mhz =3D 757.89, - .fabricclk_mhz =3D 1467.0, - .dispclk_mhz =3D 1107.69, - .dppclk_mhz =3D 1028.57, - .phyclk_mhz =3D 810.0, - .socclk_mhz =3D 715.0, - .dscclk_mhz =3D 318.334, - .dram_speed_mts =3D 3200.0, - }, - { - .state =3D 7, - .dcfclk_mhz =3D 847.06, - .fabricclk_mhz =3D 1600.0, - .dispclk_mhz =3D 1395.0, - .dppclk_mhz =3D 1285.00, - .phyclk_mhz =3D 1325.0, - .socclk_mhz =3D 953.0, - .dscclk_mhz =3D 489.0, - .dram_speed_mts =3D 4266.0, - }, - /*Extra state, no dispclk ramping*/ - { - .state =3D 8, - .dcfclk_mhz =3D 847.06, - .fabricclk_mhz =3D 1600.0, - .dispclk_mhz =3D 1395.0, - .dppclk_mhz =3D 1285.0, - .phyclk_mhz =3D 1325.0, - .socclk_mhz =3D 953.0, - .dscclk_mhz =3D 489.0, - .dram_speed_mts =3D 4266.0, - }, - - }, - - .sr_exit_time_us =3D 12.5, - .sr_enter_plus_exit_time_us =3D 17.0, - .urgent_latency_us =3D 4.0, - .urgent_latency_pixel_data_only_us =3D 4.0, - .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, - .urgent_latency_vm_data_only_us =3D 4.0, - .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, - .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, - .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 80.0, - 
.pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 75.0, - .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 40.0, - .max_avg_sdp_bw_use_normal_percent =3D 60.0, - .max_avg_dram_bw_use_normal_percent =3D 100.0, - .writeback_latency_us =3D 12.0, - .max_request_size_bytes =3D 256, - .dram_channel_width_bytes =3D 4, - .fabric_datapath_to_dcn_data_return_bytes =3D 32, - .dcn_downspread_percent =3D 0.5, - .downspread_percent =3D 0.38, - .dram_page_open_time_ns =3D 50.0, - .dram_rw_turnaround_time_ns =3D 17.5, - .dram_return_buffer_per_channel_bytes =3D 8192, - .round_trip_ping_latency_dcfclk_cycles =3D 128, - .urgent_out_of_order_return_per_channel_bytes =3D 4096, - .channel_interleave_bytes =3D 256, - .num_banks =3D 8, - .num_chans =3D 4, - .vmm_page_size_bytes =3D 4096, - .dram_clock_change_latency_us =3D 23.84, - .return_bus_width_bytes =3D 64, - .dispclk_dppclk_vco_speed_mhz =3D 3600, - .xfc_bus_transport_time_us =3D 4, - .xfc_xbuf_latency_tolerance_us =3D 4, - .use_urgent_burst_bw =3D 1, - .num_states =3D 8 -}; - -#ifndef MAX -#define MAX(X, Y) ((X) > (Y) ? (X) : (Y)) -#endif -#ifndef MIN -#define MIN(X, Y) ((X) < (Y) ? 
(X) : (Y)) -#endif - /* begin ********************* * macros to expend register list macro defined in HW object header file */ =20 @@ -705,12 +482,6 @@ static const struct dcn10_stream_encoder_mask se_mask = =3D { =20 static void dcn21_pp_smu_destroy(struct pp_smu_funcs **pp_smu); =20 -static int dcn21_populate_dml_pipes_from_context( - struct dc *dc, - struct dc_state *context, - display_e2e_pipe_params_st *pipes, - bool fast_validate); - static struct input_pixel_processor *dcn21_ipp_create( struct dc_context *ctx, uint32_t inst) { @@ -1029,163 +800,13 @@ static void dcn21_resource_destruct(struct dcn21_re= source_pool *pool) dcn21_pp_smu_destroy(&pool->base.pp_smu); } =20 - -static void calculate_wm_set_for_vlevel( - int vlevel, - struct wm_range_table_entry *table_entry, - struct dcn_watermarks *wm_set, - struct display_mode_lib *dml, - display_e2e_pipe_params_st *pipes, - int pipe_cnt) -{ - double dram_clock_change_latency_cached =3D dml->soc.dram_clock_change_la= tency_us; - - ASSERT(vlevel < dml->soc.num_states); - /* only pipe 0 is read for voltage and dcf/soc clocks */ - pipes[0].clks_cfg.voltage =3D vlevel; - pipes[0].clks_cfg.dcfclk_mhz =3D dml->soc.clock_limits[vlevel].dcfclk_mhz; - pipes[0].clks_cfg.socclk_mhz =3D dml->soc.clock_limits[vlevel].socclk_mhz; - - dml->soc.dram_clock_change_latency_us =3D table_entry->pstate_latency_us; - dml->soc.sr_exit_time_us =3D table_entry->sr_exit_time_us; - dml->soc.sr_enter_plus_exit_time_us =3D table_entry->sr_enter_plus_exit_t= ime_us; - - wm_set->urgent_ns =3D get_wm_urgent(dml, pipes, pipe_cnt) * 1000; - wm_set->cstate_pstate.cstate_enter_plus_exit_ns =3D get_wm_stutter_enter_= exit(dml, pipes, pipe_cnt) * 1000; - wm_set->cstate_pstate.cstate_exit_ns =3D get_wm_stutter_exit(dml, pipes, = pipe_cnt) * 1000; - wm_set->cstate_pstate.pstate_change_ns =3D get_wm_dram_clock_change(dml, = pipes, pipe_cnt) * 1000; - wm_set->pte_meta_urgent_ns =3D get_wm_memory_trip(dml, pipes, pipe_cnt) *= 1000; - 
wm_set->frac_urg_bw_nom =3D get_fraction_of_urgent_bandwidth(dml, pipes, = pipe_cnt) * 1000; - wm_set->frac_urg_bw_flip =3D get_fraction_of_urgent_bandwidth_imm_flip(dm= l, pipes, pipe_cnt) * 1000; - wm_set->urgent_latency_ns =3D get_urgent_latency(dml, pipes, pipe_cnt) * = 1000; - dml->soc.dram_clock_change_latency_us =3D dram_clock_change_latency_cache= d; - -} - -static void patch_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding= _box_st *bb) -{ - int i; - - if (dc->bb_overrides.sr_exit_time_ns) { - for (i =3D 0; i < WM_SET_COUNT; i++) { - dc->clk_mgr->bw_params->wm_table.entries[i].sr_exit_time_us =3D - dc->bb_overrides.sr_exit_time_ns / 1000.0; - } - } - - if (dc->bb_overrides.sr_enter_plus_exit_time_ns) { - for (i =3D 0; i < WM_SET_COUNT; i++) { - dc->clk_mgr->bw_params->wm_table.entries[i].sr_enter_plus_exit_time_u= s =3D - dc->bb_overrides.sr_enter_plus_exit_time_ns / 1000.0; - } - } - - if (dc->bb_overrides.urgent_latency_ns) { - bb->urgent_latency_us =3D dc->bb_overrides.urgent_latency_ns / 1000.0; - } - - if (dc->bb_overrides.dram_clock_change_latency_ns) { - for (i =3D 0; i < WM_SET_COUNT; i++) { - dc->clk_mgr->bw_params->wm_table.entries[i].pstate_latency_us =3D - dc->bb_overrides.dram_clock_change_latency_ns / 1000.0; - } - } -} - -static void dcn21_calculate_wm( - struct dc *dc, struct dc_state *context, - display_e2e_pipe_params_st *pipes, - int *out_pipe_cnt, - int *pipe_split_from, - int vlevel_req, - bool fast_validate) -{ - int pipe_cnt, i, pipe_idx; - int vlevel, vlevel_max; - struct wm_range_table_entry *table_entry; - struct clk_bw_params *bw_params =3D dc->clk_mgr->bw_params; - - ASSERT(bw_params); - - patch_bounding_box(dc, &context->bw_ctx.dml.soc); - - for (i =3D 0, pipe_idx =3D 0, pipe_cnt =3D 0; i < dc->res_pool->pipe_coun= t; i++) { - if (!context->res_ctx.pipe_ctx[i].stream) - continue; - - pipes[pipe_cnt].clks_cfg.refclk_mhz =3D dc->res_pool->ref_clocks.dchub_= ref_clock_inKhz / 1000.0; - pipes[pipe_cnt].clks_cfg.dispclk_mhz 
=3D context->bw_ctx.dml.vba.Requir= edDISPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb]; - - if (pipe_split_from[i] < 0) { - pipes[pipe_cnt].clks_cfg.dppclk_mhz =3D - context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.d= ml.vba.maxMpcComb][pipe_idx]; - if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_idx] =3D=3D pipe_id= x) - pipes[pipe_cnt].pipe.dest.odm_combine =3D - context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_i= dx]; - else - pipes[pipe_cnt].pipe.dest.odm_combine =3D 0; - pipe_idx++; - } else { - pipes[pipe_cnt].clks_cfg.dppclk_mhz =3D - context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.d= ml.vba.maxMpcComb][pipe_split_from[i]]; - if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_split_from[i]] =3D= =3D pipe_split_from[i]) - pipes[pipe_cnt].pipe.dest.odm_combine =3D - context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_s= plit_from[i]]; - else - pipes[pipe_cnt].pipe.dest.odm_combine =3D 0; - } - pipe_cnt++; - } - - if (pipe_cnt !=3D pipe_idx) { - if (dc->res_pool->funcs->populate_dml_pipes) - pipe_cnt =3D dc->res_pool->funcs->populate_dml_pipes(dc, - context, pipes, fast_validate); - else - pipe_cnt =3D dcn21_populate_dml_pipes_from_context(dc, - context, pipes, fast_validate); - } - - *out_pipe_cnt =3D pipe_cnt; - - vlevel_max =3D bw_params->clk_table.num_entries - 1; - - - /* WM Set D */ - table_entry =3D &bw_params->wm_table.entries[WM_D]; - if (table_entry->wm_type =3D=3D WM_TYPE_RETRAINING) - vlevel =3D 0; - else - vlevel =3D vlevel_max; - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.d, - &context->bw_ctx.dml, pipes, pipe_cnt); - /* WM Set C */ - table_entry =3D &bw_params->wm_table.entries[WM_C]; - vlevel =3D MIN(MAX(vlevel_req, 3), vlevel_max); - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.c, - &context->bw_ctx.dml, pipes, pipe_cnt); - /* WM Set B */ - table_entry =3D 
&bw_params->wm_table.entries[WM_B]; - vlevel =3D MIN(MAX(vlevel_req, 2), vlevel_max); - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.b, - &context->bw_ctx.dml, pipes, pipe_cnt); - - /* WM Set A */ - table_entry =3D &bw_params->wm_table.entries[WM_A]; - vlevel =3D MIN(vlevel_req, vlevel_max); - calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.a, - &context->bw_ctx.dml, pipes, pipe_cnt); -} - - -static bool dcn21_fast_validate_bw( - struct dc *dc, - struct dc_state *context, - display_e2e_pipe_params_st *pipes, - int *pipe_cnt_out, - int *pipe_split_from, - int *vlevel_out, - bool fast_validate) +bool dcn21_fast_validate_bw(struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *pipe_cnt_out, + int *pipe_split_from, + int *vlevel_out, + bool fast_validate) { bool out =3D false; int split[MAX_PIPES] =3D { 0 }; @@ -1197,7 +818,9 @@ static bool dcn21_fast_validate_bw( =20 dcn20_merge_pipes_for_validate(dc, context); =20 + DC_FP_START(); pipe_cnt =3D dc->res_pool->funcs->populate_dml_pipes(dc, context, pipes, = fast_validate); + DC_FP_END(); =20 *pipe_cnt_out =3D pipe_cnt; =20 @@ -1287,7 +910,9 @@ static bool dcn21_fast_validate_bw( hsplit_pipe =3D dcn20_find_secondary_pipe(dc, &context->res_ctx, dc->r= es_pool, pipe); ASSERT(hsplit_pipe); if (!hsplit_pipe) { - context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel][context->bw_ctx.dml.vb= a.maxMpcComb][pipe_idx] *=3D 2; + DC_FP_START(); + dcn20_fpu_adjust_dppclk(&context->bw_ctx.dml.vba, vlevel, context->bw= _ctx.dml.vba.maxMpcComb, pipe_idx, true); + DC_FP_END(); continue; } if (context->bw_ctx.dml.vba.ODMCombineEnabled[pipe_idx]) { @@ -1329,63 +954,6 @@ static bool dcn21_fast_validate_bw( return out; } =20 -static noinline bool dcn21_validate_bandwidth_fp(struct dc *dc, - struct dc_state *context, bool fast_validate) -{ - bool out =3D false; - - BW_VAL_TRACE_SETUP(); - - int vlevel =3D 0; - int 
pipe_split_from[MAX_PIPES]; - int pipe_cnt =3D 0; - display_e2e_pipe_params_st *pipes =3D kzalloc(dc->res_pool->pipe_count * = sizeof(display_e2e_pipe_params_st), GFP_ATOMIC); - DC_LOGGER_INIT(dc->ctx->logger); - - BW_VAL_TRACE_COUNT(); - - /*Unsafe due to current pipe merge and split logic*/ - ASSERT(context !=3D dc->current_state); - - out =3D dcn21_fast_validate_bw(dc, context, pipes, &pipe_cnt, pipe_split_= from, &vlevel, fast_validate); - - if (pipe_cnt =3D=3D 0) - goto validate_out; - - if (!out) - goto validate_fail; - - BW_VAL_TRACE_END_VOLTAGE_LEVEL(); - - if (fast_validate) { - BW_VAL_TRACE_SKIP(fast); - goto validate_out; - } - - dcn21_calculate_wm(dc, context, pipes, &pipe_cnt, pipe_split_from, vlevel= , fast_validate); - DC_FP_START(); - dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); - DC_FP_END(); - - BW_VAL_TRACE_END_WATERMARKS(); - - goto validate_out; - -validate_fail: - DC_LOG_WARNING("Mode Validation Warning: %s failed validation.\n", - dml_get_status_message(context->bw_ctx.dml.vba.ValidationStatus[context-= >bw_ctx.dml.vba.soc.num_states])); - - BW_VAL_TRACE_SKIP(fail); - out =3D false; - -validate_out: - kfree(pipes); - - BW_VAL_TRACE_FINISH(); - - return out; -} - /* * Some of the functions further below use the FPU, so we need to wrap this * with DC_FP_START()/DC_FP_END(). 
Use the same approach as for @@ -1560,94 +1128,6 @@ static struct display_stream_compressor *dcn21_dsc_c= reate(struct dc_context *ctx return &dsc->base; } =20 -static struct _vcs_dpi_voltage_scaling_st construct_low_pstate_lvl(struct = clk_limit_table *clk_table, unsigned int high_voltage_lvl) -{ - struct _vcs_dpi_voltage_scaling_st low_pstate_lvl; - int i; - - low_pstate_lvl.state =3D 1; - low_pstate_lvl.dcfclk_mhz =3D clk_table->entries[0].dcfclk_mhz; - low_pstate_lvl.fabricclk_mhz =3D clk_table->entries[0].fclk_mhz; - low_pstate_lvl.socclk_mhz =3D clk_table->entries[0].socclk_mhz; - low_pstate_lvl.dram_speed_mts =3D clk_table->entries[0].memclk_mhz * 2; - - low_pstate_lvl.dispclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].= dispclk_mhz; - low_pstate_lvl.dppclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].d= ppclk_mhz; - low_pstate_lvl.dram_bw_per_chan_gbps =3D dcn2_1_soc.clock_limits[high_vol= tage_lvl].dram_bw_per_chan_gbps; - low_pstate_lvl.dscclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].d= scclk_mhz; - low_pstate_lvl.dtbclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].d= tbclk_mhz; - low_pstate_lvl.phyclk_d18_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lv= l].phyclk_d18_mhz; - low_pstate_lvl.phyclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].p= hyclk_mhz; - - for (i =3D clk_table->num_entries; i > 1; i--) - clk_table->entries[i] =3D clk_table->entries[i-1]; - clk_table->entries[1] =3D clk_table->entries[0]; - clk_table->num_entries++; - - return low_pstate_lvl; -} - -static void update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw= _params) -{ - struct dcn21_resource_pool *pool =3D TO_DCN21_RES_POOL(dc->res_pool); - struct clk_limit_table *clk_table =3D &bw_params->clk_table; - struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; - unsigned int i, closest_clk_lvl =3D 0, k =3D 0; - int j; - - dcn2_1_ip.max_num_otg =3D pool->base.res_cap->num_timing_generator; - dcn2_1_ip.max_num_dpp =3D 
pool->base.pipe_count; - dcn2_1_soc.num_chans =3D bw_params->num_channels; - - ASSERT(clk_table->num_entries); - /* Copy dcn2_1_soc.clock_limits to clock_limits to avoid copying over nul= l states later */ - for (i =3D 0; i < dcn2_1_soc.num_states + 1; i++) { - clock_limits[i] =3D dcn2_1_soc.clock_limits[i]; - } - - for (i =3D 0; i < clk_table->num_entries; i++) { - /* loop backwards*/ - for (closest_clk_lvl =3D 0, j =3D dcn2_1_soc.num_states - 1; j >=3D 0; j= --) { - if ((unsigned int) dcn2_1_soc.clock_limits[j].dcfclk_mhz <=3D clk_table= ->entries[i].dcfclk_mhz) { - closest_clk_lvl =3D j; - break; - } - } - - /* clk_table[1] is reserved for min DF PState. skip here to fill in lat= er. */ - if (i =3D=3D 1) - k++; - - clock_limits[k].state =3D k; - clock_limits[k].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; - clock_limits[k].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; - clock_limits[k].socclk_mhz =3D clk_table->entries[i].socclk_mhz; - clock_limits[k].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2; - - clock_limits[k].dispclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl]= .dispclk_mhz; - clock_limits[k].dppclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= dppclk_mhz; - clock_limits[k].dram_bw_per_chan_gbps =3D dcn2_1_soc.clock_limits[closes= t_clk_lvl].dram_bw_per_chan_gbps; - clock_limits[k].dscclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= dscclk_mhz; - clock_limits[k].dtbclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= dtbclk_mhz; - clock_limits[k].phyclk_d18_mhz =3D dcn2_1_soc.clock_limits[closest_clk_l= vl].phyclk_d18_mhz; - clock_limits[k].phyclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= phyclk_mhz; - - k++; - } - for (i =3D 0; i < clk_table->num_entries + 1; i++) - dcn2_1_soc.clock_limits[i] =3D clock_limits[i]; - if (clk_table->num_entries) { - dcn2_1_soc.num_states =3D clk_table->num_entries + 1; - /* fill in min DF PState */ - dcn2_1_soc.clock_limits[1] =3D construct_low_pstate_lvl(clk_table, close= 
st_clk_lvl); - /* duplicate last level */ - dcn2_1_soc.clock_limits[dcn2_1_soc.num_states] =3D dcn2_1_soc.clock_limi= ts[dcn2_1_soc.num_states - 1]; - dcn2_1_soc.clock_limits[dcn2_1_soc.num_states].state =3D dcn2_1_soc.num_= states; - } - - dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21); -} - static struct pp_smu_funcs *dcn21_pp_smu_create(struct dc_context *ctx) { struct pp_smu_funcs *pp_smu =3D kzalloc(sizeof(*pp_smu), GFP_KERNEL); @@ -1898,24 +1378,6 @@ static uint32_t read_pipe_fuses(struct dc_context *c= tx) return value; } =20 -static int dcn21_populate_dml_pipes_from_context( - struct dc *dc, - struct dc_state *context, - display_e2e_pipe_params_st *pipes, - bool fast_validate) -{ - uint32_t pipe_cnt =3D dcn20_populate_dml_pipes_from_context(dc, context, = pipes, fast_validate); - int i; - - for (i =3D 0; i < pipe_cnt; i++) { - - pipes[i].pipe.src.hostvm =3D dc->res_pool->hubbub->riommu_active; - pipes[i].pipe.src.gpuvm =3D 1; - } - - return pipe_cnt; -} - static enum dc_status dcn21_patch_unknown_plane_state(struct dc_plane_stat= e *plane_state) { enum dc_status result =3D DC_OK; @@ -1943,7 +1405,7 @@ static const struct resource_funcs dcn21_res_pool_fun= cs =3D { .patch_unknown_plane_state =3D dcn21_patch_unknown_plane_state, .set_mcif_arb_params =3D dcn20_set_mcif_arb_params, .find_first_free_match_stream_enc_for_link =3D dcn10_find_first_free_matc= h_stream_enc_for_link, - .update_bw_bounding_box =3D update_bw_bounding_box + .update_bw_bounding_box =3D dcn21_update_bw_bounding_box, }; =20 static bool dcn21_resource_construct( diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h b/driver= s/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h index a27355171bca..f7ecc002c2f7 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.h @@ -35,11 +35,22 @@ struct dc; struct resource_pool; struct _vcs_dpi_display_pipe_params_st; =20 +extern struct 
_vcs_dpi_ip_params_st dcn2_1_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc; + struct dcn21_resource_pool { struct resource_pool base; }; struct resource_pool *dcn21_create_resource_pool( const struct dc_init_data *init_data, struct dc *dc); +bool dcn21_fast_validate_bw( + struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *pipe_cnt_out, + int *pipe_split_from, + int *vlevel_out, + bool fast_validate); =20 #endif /* _DCN21_RESOURCE_H_ */ diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c b/drivers= /gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c index b7adc9b6a543..bfdb4b78f571 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c @@ -27,10 +27,21 @@ #include "resource.h" #include "clk_mgr.h" #include "dc_link_dp.h" +#include "dchubbub.h" #include "dcn20/dcn20_resource.h" +#include "dcn21/dcn21_resource.h" =20 #include "dcn20_fpu.h" =20 +#define DC_LOGGER_INIT(logger) + +#ifndef MAX +#define MAX(X, Y) ((X) > (Y) ? (X) : (Y)) +#endif +#ifndef MIN +#define MIN(X, Y) ((X) < (Y) ? 
(X) : (Y)) +#endif + /** * DOC: DCN2x FPU manipulation Overview * @@ -426,7 +437,219 @@ struct _vcs_dpi_soc_bounding_box_st dcn2_0_nv14_soc = =3D { =20 struct _vcs_dpi_soc_bounding_box_st dcn2_0_nv12_soc =3D { 0 }; =20 -#define DC_LOGGER_INIT(logger) +struct _vcs_dpi_ip_params_st dcn2_1_ip =3D { + .odm_capable =3D 1, + .gpuvm_enable =3D 1, + .hostvm_enable =3D 1, + .gpuvm_max_page_table_levels =3D 1, + .hostvm_max_page_table_levels =3D 4, + .hostvm_cached_page_table_levels =3D 2, + .num_dsc =3D 3, + .rob_buffer_size_kbytes =3D 168, + .det_buffer_size_kbytes =3D 164, + .dpte_buffer_size_in_pte_reqs_luma =3D 44, + .dpte_buffer_size_in_pte_reqs_chroma =3D 42,//todo + .dpp_output_buffer_pixels =3D 2560, + .opp_output_buffer_lines =3D 1, + .pixel_chunk_size_kbytes =3D 8, + .pte_enable =3D 1, + .max_page_table_levels =3D 4, + .pte_chunk_size_kbytes =3D 2, + .meta_chunk_size_kbytes =3D 2, + .min_meta_chunk_size_bytes =3D 256, + .writeback_chunk_size_kbytes =3D 2, + .line_buffer_size_bits =3D 789504, + .is_line_buffer_bpp_fixed =3D 0, + .line_buffer_fixed_bpp =3D 0, + .dcc_supported =3D true, + .max_line_buffer_lines =3D 12, + .writeback_luma_buffer_size_kbytes =3D 12, + .writeback_chroma_buffer_size_kbytes =3D 8, + .writeback_chroma_line_buffer_width_pixels =3D 4, + .writeback_max_hscl_ratio =3D 1, + .writeback_max_vscl_ratio =3D 1, + .writeback_min_hscl_ratio =3D 1, + .writeback_min_vscl_ratio =3D 1, + .writeback_max_hscl_taps =3D 12, + .writeback_max_vscl_taps =3D 12, + .writeback_line_buffer_luma_buffer_size =3D 0, + .writeback_line_buffer_chroma_buffer_size =3D 14643, + .cursor_buffer_size =3D 8, + .cursor_chunk_size =3D 2, + .max_num_otg =3D 4, + .max_num_dpp =3D 4, + .max_num_wb =3D 1, + .max_dchub_pscl_bw_pix_per_clk =3D 4, + .max_pscl_lb_bw_pix_per_clk =3D 2, + .max_lb_vscl_bw_pix_per_clk =3D 4, + .max_vscl_hscl_bw_pix_per_clk =3D 4, + .max_hscl_ratio =3D 4, + .max_vscl_ratio =3D 4, + .hscl_mults =3D 4, + .vscl_mults =3D 4, + .max_hscl_taps =3D 8, + .max_vscl_taps 
=3D 8, + .dispclk_ramp_margin_percent =3D 1, + .underscan_factor =3D 1.10, + .min_vblank_lines =3D 32, // + .dppclk_delay_subtotal =3D 77, // + .dppclk_delay_scl_lb_only =3D 16, + .dppclk_delay_scl =3D 50, + .dppclk_delay_cnvc_formatter =3D 8, + .dppclk_delay_cnvc_cursor =3D 6, + .dispclk_delay_subtotal =3D 87, // + .dcfclk_cstate_latency =3D 10, // SRExitTime + .max_inter_dcn_tile_repeaters =3D 8, + + .xfc_supported =3D false, + .xfc_fill_bw_overhead_percent =3D 10.0, + .xfc_fill_constant_bytes =3D 0, + .ptoi_supported =3D 0, + .number_of_cursors =3D 1, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn2_1_soc =3D { + .clock_limits =3D { + { + .state =3D 0, + .dcfclk_mhz =3D 400.0, + .fabricclk_mhz =3D 400.0, + .dispclk_mhz =3D 600.0, + .dppclk_mhz =3D 400.00, + .phyclk_mhz =3D 600.0, + .socclk_mhz =3D 278.0, + .dscclk_mhz =3D 205.67, + .dram_speed_mts =3D 1600.0, + }, + { + .state =3D 1, + .dcfclk_mhz =3D 464.52, + .fabricclk_mhz =3D 800.0, + .dispclk_mhz =3D 654.55, + .dppclk_mhz =3D 626.09, + .phyclk_mhz =3D 600.0, + .socclk_mhz =3D 278.0, + .dscclk_mhz =3D 205.67, + .dram_speed_mts =3D 1600.0, + }, + { + .state =3D 2, + .dcfclk_mhz =3D 514.29, + .fabricclk_mhz =3D 933.0, + .dispclk_mhz =3D 757.89, + .dppclk_mhz =3D 685.71, + .phyclk_mhz =3D 600.0, + .socclk_mhz =3D 278.0, + .dscclk_mhz =3D 287.67, + .dram_speed_mts =3D 1866.0, + }, + { + .state =3D 3, + .dcfclk_mhz =3D 576.00, + .fabricclk_mhz =3D 1067.0, + .dispclk_mhz =3D 847.06, + .dppclk_mhz =3D 757.89, + .phyclk_mhz =3D 600.0, + .socclk_mhz =3D 715.0, + .dscclk_mhz =3D 318.334, + .dram_speed_mts =3D 2134.0, + }, + { + .state =3D 4, + .dcfclk_mhz =3D 626.09, + .fabricclk_mhz =3D 1200.0, + .dispclk_mhz =3D 900.00, + .dppclk_mhz =3D 847.06, + .phyclk_mhz =3D 810.0, + .socclk_mhz =3D 953.0, + .dscclk_mhz =3D 489.0, + .dram_speed_mts =3D 2400.0, + }, + { + .state =3D 5, + .dcfclk_mhz =3D 685.71, + .fabricclk_mhz =3D 1333.0, + .dispclk_mhz =3D 1028.57, + .dppclk_mhz =3D 960.00, + .phyclk_mhz =3D 810.0, + 
.socclk_mhz =3D 278.0, + .dscclk_mhz =3D 287.67, + .dram_speed_mts =3D 2666.0, + }, + { + .state =3D 6, + .dcfclk_mhz =3D 757.89, + .fabricclk_mhz =3D 1467.0, + .dispclk_mhz =3D 1107.69, + .dppclk_mhz =3D 1028.57, + .phyclk_mhz =3D 810.0, + .socclk_mhz =3D 715.0, + .dscclk_mhz =3D 318.334, + .dram_speed_mts =3D 3200.0, + }, + { + .state =3D 7, + .dcfclk_mhz =3D 847.06, + .fabricclk_mhz =3D 1600.0, + .dispclk_mhz =3D 1395.0, + .dppclk_mhz =3D 1285.00, + .phyclk_mhz =3D 1325.0, + .socclk_mhz =3D 953.0, + .dscclk_mhz =3D 489.0, + .dram_speed_mts =3D 4266.0, + }, + /*Extra state, no dispclk ramping*/ + { + .state =3D 8, + .dcfclk_mhz =3D 847.06, + .fabricclk_mhz =3D 1600.0, + .dispclk_mhz =3D 1395.0, + .dppclk_mhz =3D 1285.0, + .phyclk_mhz =3D 1325.0, + .socclk_mhz =3D 953.0, + .dscclk_mhz =3D 489.0, + .dram_speed_mts =3D 4266.0, + }, + + }, + + .sr_exit_time_us =3D 12.5, + .sr_enter_plus_exit_time_us =3D 17.0, + .urgent_latency_us =3D 4.0, + .urgent_latency_pixel_data_only_us =3D 4.0, + .urgent_latency_pixel_mixed_with_vm_data_us =3D 4.0, + .urgent_latency_vm_data_only_us =3D 4.0, + .urgent_out_of_order_return_per_channel_pixel_only_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_pixel_and_vm_bytes =3D 4096, + .urgent_out_of_order_return_per_channel_vm_only_bytes =3D 4096, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_only =3D 80.0, + .pct_ideal_dram_sdp_bw_after_urgent_pixel_and_vm =3D 75.0, + .pct_ideal_dram_sdp_bw_after_urgent_vm_only =3D 40.0, + .max_avg_sdp_bw_use_normal_percent =3D 60.0, + .max_avg_dram_bw_use_normal_percent =3D 100.0, + .writeback_latency_us =3D 12.0, + .max_request_size_bytes =3D 256, + .dram_channel_width_bytes =3D 4, + .fabric_datapath_to_dcn_data_return_bytes =3D 32, + .dcn_downspread_percent =3D 0.5, + .downspread_percent =3D 0.38, + .dram_page_open_time_ns =3D 50.0, + .dram_rw_turnaround_time_ns =3D 17.5, + .dram_return_buffer_per_channel_bytes =3D 8192, + .round_trip_ping_latency_dcfclk_cycles =3D 128, + 
.urgent_out_of_order_return_per_channel_bytes =3D 4096, + .channel_interleave_bytes =3D 256, + .num_banks =3D 8, + .num_chans =3D 4, + .vmm_page_size_bytes =3D 4096, + .dram_clock_change_latency_us =3D 23.84, + .return_bus_width_bytes =3D 64, + .dispclk_dppclk_vco_speed_mhz =3D 3600, + .xfc_bus_transport_time_us =3D 4, + .xfc_xbuf_latency_tolerance_us =3D 4, + .use_urgent_burst_bw =3D 1, + .num_states =3D 8 +}; =20 void dcn20_populate_dml_writeback_from_context(struct dc *dc, struct resource_context *res_ctx, @@ -1485,3 +1708,316 @@ void dcn20_fpu_adjust_dppclk(struct vba_vars_st *v, else v->RequiredDPPCLK[vlevel][max_mpc_comb][pipe_idx] /=3D 2; } + +int dcn21_populate_dml_pipes_from_context(struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + bool fast_validate) +{ + uint32_t pipe_cnt; + int i; + + dc_assert_fp_enabled(); + + pipe_cnt =3D dcn20_populate_dml_pipes_from_context(dc, context, pipes, fa= st_validate); + + for (i =3D 0; i < pipe_cnt; i++) { + + pipes[i].pipe.src.hostvm =3D dc->res_pool->hubbub->riommu_active; + pipes[i].pipe.src.gpuvm =3D 1; + } + + return pipe_cnt; +} + +static void patch_bounding_box(struct dc *dc, struct _vcs_dpi_soc_bounding= _box_st *bb) +{ + int i; + + if (dc->bb_overrides.sr_exit_time_ns) { + for (i =3D 0; i < WM_SET_COUNT; i++) { + dc->clk_mgr->bw_params->wm_table.entries[i].sr_exit_time_us =3D + dc->bb_overrides.sr_exit_time_ns / 1000.0; + } + } + + if (dc->bb_overrides.sr_enter_plus_exit_time_ns) { + for (i =3D 0; i < WM_SET_COUNT; i++) { + dc->clk_mgr->bw_params->wm_table.entries[i].sr_enter_plus_exit_time_u= s =3D + dc->bb_overrides.sr_enter_plus_exit_time_ns / 1000.0; + } + } + + if (dc->bb_overrides.urgent_latency_ns) { + bb->urgent_latency_us =3D dc->bb_overrides.urgent_latency_ns / 1000.0; + } + + if (dc->bb_overrides.dram_clock_change_latency_ns) { + for (i =3D 0; i < WM_SET_COUNT; i++) { + dc->clk_mgr->bw_params->wm_table.entries[i].pstate_latency_us =3D + 
dc->bb_overrides.dram_clock_change_latency_ns / 1000.0; + } + } +} + +static void calculate_wm_set_for_vlevel(int vlevel, + struct wm_range_table_entry *table_entry, + struct dcn_watermarks *wm_set, + struct display_mode_lib *dml, + display_e2e_pipe_params_st *pipes, + int pipe_cnt) +{ + double dram_clock_change_latency_cached =3D dml->soc.dram_clock_change_la= tency_us; + + ASSERT(vlevel < dml->soc.num_states); + /* only pipe 0 is read for voltage and dcf/soc clocks */ + pipes[0].clks_cfg.voltage =3D vlevel; + pipes[0].clks_cfg.dcfclk_mhz =3D dml->soc.clock_limits[vlevel].dcfclk_mhz; + pipes[0].clks_cfg.socclk_mhz =3D dml->soc.clock_limits[vlevel].socclk_mhz; + + dml->soc.dram_clock_change_latency_us =3D table_entry->pstate_latency_us; + dml->soc.sr_exit_time_us =3D table_entry->sr_exit_time_us; + dml->soc.sr_enter_plus_exit_time_us =3D table_entry->sr_enter_plus_exit_t= ime_us; + + wm_set->urgent_ns =3D get_wm_urgent(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.cstate_enter_plus_exit_ns =3D get_wm_stutter_enter_= exit(dml, pipes, pipe_cnt) * 1000; + wm_set->cstate_pstate.cstate_exit_ns =3D get_wm_stutter_exit(dml, pipes, = pipe_cnt) * 1000; + wm_set->cstate_pstate.pstate_change_ns =3D get_wm_dram_clock_change(dml, = pipes, pipe_cnt) * 1000; + wm_set->pte_meta_urgent_ns =3D get_wm_memory_trip(dml, pipes, pipe_cnt) *= 1000; + wm_set->frac_urg_bw_nom =3D get_fraction_of_urgent_bandwidth(dml, pipes, = pipe_cnt) * 1000; + wm_set->frac_urg_bw_flip =3D get_fraction_of_urgent_bandwidth_imm_flip(dm= l, pipes, pipe_cnt) * 1000; + wm_set->urgent_latency_ns =3D get_urgent_latency(dml, pipes, pipe_cnt) * = 1000; + dml->soc.dram_clock_change_latency_us =3D dram_clock_change_latency_cache= d; +} + +static void dcn21_calculate_wm(struct dc *dc, struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *out_pipe_cnt, + int *pipe_split_from, + int vlevel_req, + bool fast_validate) +{ + int pipe_cnt, i, pipe_idx; + int vlevel, vlevel_max; + struct 
wm_range_table_entry *table_entry; + struct clk_bw_params *bw_params =3D dc->clk_mgr->bw_params; + + ASSERT(bw_params); + + patch_bounding_box(dc, &context->bw_ctx.dml.soc); + + for (i =3D 0, pipe_idx =3D 0, pipe_cnt =3D 0; i < dc->res_pool->pipe_coun= t; i++) { + if (!context->res_ctx.pipe_ctx[i].stream) + continue; + + pipes[pipe_cnt].clks_cfg.refclk_mhz =3D dc->res_pool->ref_clocks.dchub_= ref_clock_inKhz / 1000.0; + pipes[pipe_cnt].clks_cfg.dispclk_mhz =3D context->bw_ctx.dml.vba.Requir= edDISPCLK[vlevel_req][context->bw_ctx.dml.vba.maxMpcComb]; + + if (pipe_split_from[i] < 0) { + pipes[pipe_cnt].clks_cfg.dppclk_mhz =3D + context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.d= ml.vba.maxMpcComb][pipe_idx]; + if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_idx] =3D=3D pipe_id= x) + pipes[pipe_cnt].pipe.dest.odm_combine =3D + context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_i= dx]; + else + pipes[pipe_cnt].pipe.dest.odm_combine =3D 0; + pipe_idx++; + } else { + pipes[pipe_cnt].clks_cfg.dppclk_mhz =3D + context->bw_ctx.dml.vba.RequiredDPPCLK[vlevel_req][context->bw_ctx.d= ml.vba.maxMpcComb][pipe_split_from[i]]; + if (context->bw_ctx.dml.vba.BlendingAndTiming[pipe_split_from[i]] =3D= =3D pipe_split_from[i]) + pipes[pipe_cnt].pipe.dest.odm_combine =3D + context->bw_ctx.dml.vba.ODMCombineEnablePerState[vlevel_req][pipe_s= plit_from[i]]; + else + pipes[pipe_cnt].pipe.dest.odm_combine =3D 0; + } + pipe_cnt++; + } + + if (pipe_cnt !=3D pipe_idx) { + if (dc->res_pool->funcs->populate_dml_pipes) + pipe_cnt =3D dc->res_pool->funcs->populate_dml_pipes(dc, + context, pipes, fast_validate); + else + pipe_cnt =3D dcn21_populate_dml_pipes_from_context(dc, + context, pipes, fast_validate); + } + + *out_pipe_cnt =3D pipe_cnt; + + vlevel_max =3D bw_params->clk_table.num_entries - 1; + + + /* WM Set D */ + table_entry =3D &bw_params->wm_table.entries[WM_D]; + if (table_entry->wm_type =3D=3D WM_TYPE_RETRAINING) + vlevel =3D 0; + else + vlevel =3D 
vlevel_max; + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.d, + &context->bw_ctx.dml, pipes, pipe_cnt); + /* WM Set C */ + table_entry =3D &bw_params->wm_table.entries[WM_C]; + vlevel =3D MIN(MAX(vlevel_req, 3), vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.c, + &context->bw_ctx.dml, pipes, pipe_cnt); + /* WM Set B */ + table_entry =3D &bw_params->wm_table.entries[WM_B]; + vlevel =3D MIN(MAX(vlevel_req, 2), vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.b, + &context->bw_ctx.dml, pipes, pipe_cnt); + + /* WM Set A */ + table_entry =3D &bw_params->wm_table.entries[WM_A]; + vlevel =3D MIN(vlevel_req, vlevel_max); + calculate_wm_set_for_vlevel(vlevel, table_entry, &context->bw_ctx.bw.dcn.= watermarks.a, + &context->bw_ctx.dml, pipes, pipe_cnt); +} + +bool dcn21_validate_bandwidth_fp(struct dc *dc, + struct dc_state *context, + bool fast_validate) +{ + bool out =3D false; + + BW_VAL_TRACE_SETUP(); + + int vlevel =3D 0; + int pipe_split_from[MAX_PIPES]; + int pipe_cnt =3D 0; + display_e2e_pipe_params_st *pipes =3D kzalloc(dc->res_pool->pipe_count * = sizeof(display_e2e_pipe_params_st), GFP_ATOMIC); + DC_LOGGER_INIT(dc->ctx->logger); + + BW_VAL_TRACE_COUNT(); + + dc_assert_fp_enabled(); + + /*Unsafe due to current pipe merge and split logic*/ + ASSERT(context !=3D dc->current_state); + + out =3D dcn21_fast_validate_bw(dc, context, pipes, &pipe_cnt, pipe_split_= from, &vlevel, fast_validate); + + if (pipe_cnt =3D=3D 0) + goto validate_out; + + if (!out) + goto validate_fail; + + BW_VAL_TRACE_END_VOLTAGE_LEVEL(); + + if (fast_validate) { + BW_VAL_TRACE_SKIP(fast); + goto validate_out; + } + + dcn21_calculate_wm(dc, context, pipes, &pipe_cnt, pipe_split_from, vlevel= , fast_validate); + dcn20_calculate_dlg_params(dc, context, pipes, pipe_cnt, vlevel); + + BW_VAL_TRACE_END_WATERMARKS(); + + goto validate_out; + +validate_fail: + 
DC_LOG_WARNING("Mode Validation Warning: %s failed validation.\n", + dml_get_status_message(context->bw_ctx.dml.vba.ValidationStatus[context= ->bw_ctx.dml.vba.soc.num_states])); + + BW_VAL_TRACE_SKIP(fail); + out =3D false; + +validate_out: + kfree(pipes); + + BW_VAL_TRACE_FINISH(); + + return out; +} + +static struct _vcs_dpi_voltage_scaling_st construct_low_pstate_lvl(struct = clk_limit_table *clk_table, unsigned int high_voltage_lvl) +{ + struct _vcs_dpi_voltage_scaling_st low_pstate_lvl; + int i; + + low_pstate_lvl.state =3D 1; + low_pstate_lvl.dcfclk_mhz =3D clk_table->entries[0].dcfclk_mhz; + low_pstate_lvl.fabricclk_mhz =3D clk_table->entries[0].fclk_mhz; + low_pstate_lvl.socclk_mhz =3D clk_table->entries[0].socclk_mhz; + low_pstate_lvl.dram_speed_mts =3D clk_table->entries[0].memclk_mhz * 2; + + low_pstate_lvl.dispclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].= dispclk_mhz; + low_pstate_lvl.dppclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].d= ppclk_mhz; + low_pstate_lvl.dram_bw_per_chan_gbps =3D dcn2_1_soc.clock_limits[high_vol= tage_lvl].dram_bw_per_chan_gbps; + low_pstate_lvl.dscclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].d= scclk_mhz; + low_pstate_lvl.dtbclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].d= tbclk_mhz; + low_pstate_lvl.phyclk_d18_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lv= l].phyclk_d18_mhz; + low_pstate_lvl.phyclk_mhz =3D dcn2_1_soc.clock_limits[high_voltage_lvl].p= hyclk_mhz; + + for (i =3D clk_table->num_entries; i > 1; i--) + clk_table->entries[i] =3D clk_table->entries[i-1]; + clk_table->entries[1] =3D clk_table->entries[0]; + clk_table->num_entries++; + + return low_pstate_lvl; +} + +void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params) +{ + struct dcn21_resource_pool *pool =3D TO_DCN21_RES_POOL(dc->res_pool); + struct clk_limit_table *clk_table =3D &bw_params->clk_table; + struct _vcs_dpi_voltage_scaling_st clock_limits[DC__VOLTAGE_STATES]; + unsigned int i, 
closest_clk_lvl =3D 0, k =3D 0; + int j; + + dc_assert_fp_enabled(); + + dcn2_1_ip.max_num_otg =3D pool->base.res_cap->num_timing_generator; + dcn2_1_ip.max_num_dpp =3D pool->base.pipe_count; + dcn2_1_soc.num_chans =3D bw_params->num_channels; + + ASSERT(clk_table->num_entries); + /* Copy dcn2_1_soc.clock_limits to clock_limits to avoid copying over nul= l states later */ + for (i =3D 0; i < dcn2_1_soc.num_states + 1; i++) { + clock_limits[i] =3D dcn2_1_soc.clock_limits[i]; + } + + for (i =3D 0; i < clk_table->num_entries; i++) { + /* loop backwards*/ + for (closest_clk_lvl =3D 0, j =3D dcn2_1_soc.num_states - 1; j >=3D 0; j= --) { + if ((unsigned int) dcn2_1_soc.clock_limits[j].dcfclk_mhz <=3D clk_table= ->entries[i].dcfclk_mhz) { + closest_clk_lvl =3D j; + break; + } + } + + /* clk_table[1] is reserved for min DF PState. skip here to fill in lat= er. */ + if (i =3D=3D 1) + k++; + + clock_limits[k].state =3D k; + clock_limits[k].dcfclk_mhz =3D clk_table->entries[i].dcfclk_mhz; + clock_limits[k].fabricclk_mhz =3D clk_table->entries[i].fclk_mhz; + clock_limits[k].socclk_mhz =3D clk_table->entries[i].socclk_mhz; + clock_limits[k].dram_speed_mts =3D clk_table->entries[i].memclk_mhz * 2; + + clock_limits[k].dispclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl]= .dispclk_mhz; + clock_limits[k].dppclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= dppclk_mhz; + clock_limits[k].dram_bw_per_chan_gbps =3D dcn2_1_soc.clock_limits[closes= t_clk_lvl].dram_bw_per_chan_gbps; + clock_limits[k].dscclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= dscclk_mhz; + clock_limits[k].dtbclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= dtbclk_mhz; + clock_limits[k].phyclk_d18_mhz =3D dcn2_1_soc.clock_limits[closest_clk_l= vl].phyclk_d18_mhz; + clock_limits[k].phyclk_mhz =3D dcn2_1_soc.clock_limits[closest_clk_lvl].= phyclk_mhz; + + k++; + } + for (i =3D 0; i < clk_table->num_entries + 1; i++) + dcn2_1_soc.clock_limits[i] =3D clock_limits[i]; + if (clk_table->num_entries) 
{ + dcn2_1_soc.num_states =3D clk_table->num_entries + 1; + /* fill in min DF PState */ + dcn2_1_soc.clock_limits[1] =3D construct_low_pstate_lvl(clk_table, close= st_clk_lvl); + /* duplicate last level */ + dcn2_1_soc.clock_limits[dcn2_1_soc.num_states] =3D dcn2_1_soc.clock_limi= ts[dcn2_1_soc.num_states - 1]; + dcn2_1_soc.clock_limits[dcn2_1_soc.num_states].state =3D dcn2_1_soc.num_= states; + } + + dml_init_instance(&dc->dml, &dcn2_1_soc, &dcn2_1_ip, DML_PROJECT_DCN21); +} diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h b/drivers= /gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h index 6b1f4126bc88..da38fa10f077 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.h @@ -71,4 +71,13 @@ void dcn20_fpu_adjust_dppclk(struct vba_vars_st *v, int max_mpc_comb, int pipe_idx, bool is_validating_bw); + +int dcn21_populate_dml_pipes_from_context(struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + bool fast_validate); +bool dcn21_validate_bandwidth_fp(struct dc *dc, + struct dc_state *context, + bool fast_validate); +void dcn21_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_= params); #endif /* __DCN20_FPU_H__ */ --=20 2.34.1 From nobody Tue Jun 23 18:20:17 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38EAEC433F5 for ; Mon, 28 Feb 2022 21:12:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230074AbiB1VMi (ORCPT ); Mon, 28 Feb 2022 16:12:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60902 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230061AbiB1VMg (ORCPT ); Mon, 28 Feb 2022 16:12:36 -0500 Received: from fanzine2.igalia.com (fanzine2.igalia.com 
[213.97.179.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6228496A0 for ; Mon, 28 Feb 2022 13:11:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=hlxuHAMRU3IMg6hj7WJxf5M11wcx0ijnT9+HHKNCJmY=; b=obQcFpsBHStgolqorqjWeQ4+Bo dXzFLLVuEKqbAFgwEib3HJ8yHcdiLk6GQafUZkwZCbg311p9psRwTkoiDNGZDzqVwgH4mB43qAffI wgkft3zSyMGjVhI7uN5KL5KXbp8P9Lyk2GVEDopiXcD/wu4gjBIjXGS6wuySB3aXBh7FAjpO69o1I OX/VrbntjRoQ5w+G5ML18oJnElP9XyxqL0+ywMPy+pl9h5t0K4kN3MU/7RDyvR371hCIUJBza2Deu B6bMOXgiOfvhwSrvMtaR+t7PNs39lSB+fs7FS4ViQq72Ph1zV4PNQ/z+3EeEbGzQkvXg/jlYCoxSb lePgc7Mw==; Received: from [165.90.126.25] (helo=killbill.home) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1nOnJ7-000CFy-Gv; Mon, 28 Feb 2022 22:11:49 +0100 From: Melissa Wen To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, harry.wentland@amd.com, sunpeng.li@amd.com, Rodrigo.Siqueira@amd.com, alexander.deucher@amd.com, christian.koenig@amd.com, Xinhui.Pan@amd.com, airlied@linux.ie, daniel@ffwll.ch Cc: Dmytro Laktyushkin , Jasdeep Dhillon , Qingqing Zhuo , Melissa Wen , linux-kernel@vger.kernel.org Subject: [PATCH 2/2] drm/amd/display: move FPU code from dcn10 to dml/dcn10 folder Date: Mon, 28 Feb 2022 20:10:47 -0100 Message-Id: <20220228211047.3957945-3-mwen@igalia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220228211047.3957945-1-mwen@igalia.com> References: <20220228211047.3957945-1-mwen@igalia.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: 
text/plain; charset="utf-8" FPU operations in dcn10 were already moved to the dml folder via the calcs code. However, dcn1_0_ip and dcn1_0_soc, which have FPU components, remain in dcn10. Following previous changes to isolate FPU, this patch creates dcn10_fpu files to isolate FPU-specific code and moves those structs to it. Signed-off-by: Melissa Wen --- .../drm/amd/display/dc/dcn10/dcn10_resource.c | 62 --------- .../drm/amd/display/dc/dcn10/dcn10_resource.h | 4 + drivers/gpu/drm/amd/display/dc/dml/Makefile | 2 + .../drm/amd/display/dc/dml/dcn10/dcn10_fpu.c | 124 ++++++++++++++++++ .../drm/amd/display/dc/dml/dcn10/dcn10_fpu.h | 30 +++++ 5 files changed, 160 insertions(+), 62 deletions(-) create mode 100644 drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c create mode 100644 drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/driver= s/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c index 858b72149897..ac96242cc474 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c @@ -70,68 +70,6 @@ #include "dce/dce_aux.h" #include "dce/dce_i2c.h" =20 -const struct _vcs_dpi_ip_params_st dcn1_0_ip =3D { - .rob_buffer_size_kbytes =3D 64, - .det_buffer_size_kbytes =3D 164, - .dpte_buffer_size_in_pte_reqs_luma =3D 42, - .dpp_output_buffer_pixels =3D 2560, - .opp_output_buffer_lines =3D 1, - .pixel_chunk_size_kbytes =3D 8, - .pte_enable =3D 1, - .pte_chunk_size_kbytes =3D 2, - .meta_chunk_size_kbytes =3D 2, - .writeback_chunk_size_kbytes =3D 2, - .line_buffer_size_bits =3D 589824, - .max_line_buffer_lines =3D 12, - .IsLineBufferBppFixed =3D 0, - .LineBufferFixedBpp =3D -1, - .writeback_luma_buffer_size_kbytes =3D 12, - .writeback_chroma_buffer_size_kbytes =3D 8, - .max_num_dpp =3D 4, - .max_num_wb =3D 2, - .max_dchub_pscl_bw_pix_per_clk =3D 4, - .max_pscl_lb_bw_pix_per_clk =3D 2, - .max_lb_vscl_bw_pix_per_clk =3D 4, - .max_vscl_hscl_bw_pix_per_clk =3D 4, 
- .max_hscl_ratio =3D 4, - .max_vscl_ratio =3D 4, - .hscl_mults =3D 4, - .vscl_mults =3D 4, - .max_hscl_taps =3D 8, - .max_vscl_taps =3D 8, - .dispclk_ramp_margin_percent =3D 1, - .underscan_factor =3D 1.10, - .min_vblank_lines =3D 14, - .dppclk_delay_subtotal =3D 90, - .dispclk_delay_subtotal =3D 42, - .dcfclk_cstate_latency =3D 10, - .max_inter_dcn_tile_repeaters =3D 8, - .can_vstartup_lines_exceed_vsync_plus_back_porch_lines_minus_one =3D 0, - .bug_forcing_LC_req_same_size_fixed =3D 0, -}; - -const struct _vcs_dpi_soc_bounding_box_st dcn1_0_soc =3D { - .sr_exit_time_us =3D 9.0, - .sr_enter_plus_exit_time_us =3D 11.0, - .urgent_latency_us =3D 4.0, - .writeback_latency_us =3D 12.0, - .ideal_dram_bw_after_urgent_percent =3D 80.0, - .max_request_size_bytes =3D 256, - .downspread_percent =3D 0.5, - .dram_page_open_time_ns =3D 50.0, - .dram_rw_turnaround_time_ns =3D 17.5, - .dram_return_buffer_per_channel_bytes =3D 8192, - .round_trip_ping_latency_dcfclk_cycles =3D 128, - .urgent_out_of_order_return_per_channel_bytes =3D 256, - .channel_interleave_bytes =3D 256, - .num_banks =3D 8, - .num_chans =3D 2, - .vmm_page_size_bytes =3D 4096, - .dram_clock_change_latency_us =3D 17.0, - .writeback_dram_clock_change_latency_us =3D 23.0, - .return_bus_width_bytes =3D 64, -}; - #ifndef mmDP0_DP_DPHY_INTERNAL_CTRL #define mmDP0_DP_DPHY_INTERNAL_CTRL 0x210f #define mmDP0_DP_DPHY_INTERNAL_CTRL_BASE_IDX 2 diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h b/driver= s/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h index 633025ccb870..bf8e33cd8147 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.h @@ -27,6 +27,7 @@ #define __DC_RESOURCE_DCN10_H__ =20 #include "core_types.h" +#include "dml/dcn10/dcn10_fpu.h" =20 #define TO_DCN10_RES_POOL(pool)\ container_of(pool, struct dcn10_resource_pool, base) @@ -35,6 +36,9 @@ struct dc; struct resource_pool; struct _vcs_dpi_display_pipe_params_st; =20 +extern 
struct _vcs_dpi_ip_params_st dcn1_0_ip; +extern struct _vcs_dpi_soc_bounding_box_st dcn1_0_soc; + struct dcn10_resource_pool { struct resource_pool base; }; diff --git a/drivers/gpu/drm/amd/display/dc/dml/Makefile b/drivers/gpu/drm/= amd/display/dc/dml/Makefile index b16c492593e2..6b7f8b62a56f 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/Makefile +++ b/drivers/gpu/drm/amd/display/dc/dml/Makefile @@ -58,6 +58,7 @@ CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_lib.o :=3D $(dml= _ccflags) =20 ifdef CONFIG_DRM_AMD_DC_DCN CFLAGS_$(AMDDALPATH)/dc/dml/display_mode_vba.o :=3D $(dml_ccflags) +CFLAGS_$(AMDDALPATH)/dc/dml/dcn10/dcn10_fpu.o :=3D $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/dcn20_fpu.o :=3D $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o :=3D $(dml_ccflags) CFLAGS_$(AMDDALPATH)/dc/dml/dcn20/display_mode_vba_20.o :=3D $(dml_ccflags) @@ -105,6 +106,7 @@ DML =3D calcs/dce_calcs.o calcs/custom_float.o calcs/bw= _fixed.o =20 ifdef CONFIG_DRM_AMD_DC_DCN DML +=3D display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_c= alc.o +DML +=3D dcn10/dcn10_fpu.o DML +=3D dcn20/dcn20_fpu.o DML +=3D display_mode_vba.o dcn20/display_rq_dlg_calc_20.o dcn20/display_m= ode_vba_20.o DML +=3D dcn20/display_rq_dlg_calc_20v2.o dcn20/display_mode_vba_20v2.o diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c b/drivers= /gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c new file mode 100644 index 000000000000..7f08f49eb7b1 --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright 2021 Advanced Micro Devices, Inc. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software= "), + * to deal in the Software without restriction, including without limitati= on + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#include "dcn10/dcn10_resource.h" + +#include "dcn10_fpu.h" + +/** + * DOC: DCN10 FPU manipulation Overview + * + * The DCN architecture relies on FPU operations, which require special + * compilation flags and the use of kernel_fpu_begin/end functions; ideall= y, we + * want to avoid spreading FPU access across multiple files. With this ide= a in + * mind, this file aims to centralize DCN10 functions that require FPU acc= ess + * in a single place. Code in this file follows the following code pattern: + * + * 1. Functions that use FPU operations should be isolated in static funct= ions. + * 2. The FPU functions should have the noinline attribute to ensure anyth= ing + * that deals with FP register is contained within this call. + * 3. 
Any function that must be accessed outside this file requires a + * public interface that does not use any FPU reference. + * 4. Developers **must not** use DC_FP_START/END in this file, but they n= eed + * to ensure that the caller invokes it before accessing any function a= vailable + * in this file. For this reason, public functions in this file must in= voke + * dc_assert_fp_enabled(); + * + * Let's expand a little more on the idea in the code pattern. To fully + * isolate FPU operations in a single place, we must avoid situations where + * compilers spill FP values to registers due to FP enable in a specific C + * file. Note that even if we isolate all FPU functions in a single file a= nd + * call its interface from other files, the compiler might enable the use = of + * FPU before we call DC_FP_START. Nevertheless, it is the programmer's + * responsibility to invoke DC_FP_START/END in the correct place. To highl= ight + * situations where developers forget to use the FP protection before call= ing + * the DC FPU interface functions, we introduce a helper that checks if the + * function is invoked under FP protection. If not, it will trigger a kern= el + * warning. 
+ */ + +struct _vcs_dpi_ip_params_st dcn1_0_ip =3D { + .rob_buffer_size_kbytes =3D 64, + .det_buffer_size_kbytes =3D 164, + .dpte_buffer_size_in_pte_reqs_luma =3D 42, + .dpp_output_buffer_pixels =3D 2560, + .opp_output_buffer_lines =3D 1, + .pixel_chunk_size_kbytes =3D 8, + .pte_enable =3D 1, + .pte_chunk_size_kbytes =3D 2, + .meta_chunk_size_kbytes =3D 2, + .writeback_chunk_size_kbytes =3D 2, + .line_buffer_size_bits =3D 589824, + .max_line_buffer_lines =3D 12, + .IsLineBufferBppFixed =3D 0, + .LineBufferFixedBpp =3D -1, + .writeback_luma_buffer_size_kbytes =3D 12, + .writeback_chroma_buffer_size_kbytes =3D 8, + .max_num_dpp =3D 4, + .max_num_wb =3D 2, + .max_dchub_pscl_bw_pix_per_clk =3D 4, + .max_pscl_lb_bw_pix_per_clk =3D 2, + .max_lb_vscl_bw_pix_per_clk =3D 4, + .max_vscl_hscl_bw_pix_per_clk =3D 4, + .max_hscl_ratio =3D 4, + .max_vscl_ratio =3D 4, + .hscl_mults =3D 4, + .vscl_mults =3D 4, + .max_hscl_taps =3D 8, + .max_vscl_taps =3D 8, + .dispclk_ramp_margin_percent =3D 1, + .underscan_factor =3D 1.10, + .min_vblank_lines =3D 14, + .dppclk_delay_subtotal =3D 90, + .dispclk_delay_subtotal =3D 42, + .dcfclk_cstate_latency =3D 10, + .max_inter_dcn_tile_repeaters =3D 8, + .can_vstartup_lines_exceed_vsync_plus_back_porch_lines_minus_one =3D 0, + .bug_forcing_LC_req_same_size_fixed =3D 0, +}; + +struct _vcs_dpi_soc_bounding_box_st dcn1_0_soc =3D { + .sr_exit_time_us =3D 9.0, + .sr_enter_plus_exit_time_us =3D 11.0, + .urgent_latency_us =3D 4.0, + .writeback_latency_us =3D 12.0, + .ideal_dram_bw_after_urgent_percent =3D 80.0, + .max_request_size_bytes =3D 256, + .downspread_percent =3D 0.5, + .dram_page_open_time_ns =3D 50.0, + .dram_rw_turnaround_time_ns =3D 17.5, + .dram_return_buffer_per_channel_bytes =3D 8192, + .round_trip_ping_latency_dcfclk_cycles =3D 128, + .urgent_out_of_order_return_per_channel_bytes =3D 256, + .channel_interleave_bytes =3D 256, + .num_banks =3D 8, + .num_chans =3D 2, + .vmm_page_size_bytes =3D 4096, + .dram_clock_change_latency_us =3D 17.0, 
+ .writeback_dram_clock_change_latency_us =3D 23.0, + .return_bus_width_bytes =3D 64, +}; + diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h b/drivers= /gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h new file mode 100644 index 000000000000..e74ed4b4ce5b --- /dev/null +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn10/dcn10_fpu.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright 2021 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software= "), + * to deal in the Software without restriction, including without limitati= on + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included= in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS= OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: AMD + * + */ + +#ifndef __DCN10_FPU_H__ +#define __DCN10_FPU_H__ + +#endif /* __DCN10_FPU_H__ */ --=20 2.34.1
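Note on the code pattern described in the DOC comment of dcn10_fpu.c: the split between a caller that opens the FP-protected region and an FPU-file function that only asserts it can be sketched in user space roughly as below. This is only an illustrative sketch, not the driver code. Every sketch_* name and both counters are hypothetical stand-ins for DC_FP_START/END and dc_assert_fp_enabled(), and the conversion merely mimics the ns/us arithmetic seen in patch_bounding_box() and calculate_wm_set_for_vlevel().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel helpers (assumptions, not real API):
 * sketch_fp_start()/sketch_fp_end() play the role of DC_FP_START/END and
 * sketch_assert_fp_enabled() plays the role of dc_assert_fp_enabled(). */
static int fp_depth;      /* nesting depth of the simulated FP region */
static int fp_violations; /* times FP code ran outside that region */

static void sketch_fp_start(void)
{
	fp_depth++;
}

static void sketch_fp_end(void)
{
	assert(fp_depth > 0);
	fp_depth--;
}

static void sketch_assert_fp_enabled(void)
{
	if (fp_depth <= 0) {
		fp_violations++;
		fprintf(stderr, "warning: FP code called without FP protection\n");
	}
}

/* "FPU file" side: the public entry point never opens the FP region itself.
 * It only checks that the caller did, then performs the FP arithmetic. */
unsigned int sketch_watermark_ns(unsigned int latency_ns)
{
	double latency_us;

	sketch_assert_fp_enabled();

	latency_us = latency_ns / 1000.0;                /* ns -> us, as in patch_bounding_box() */
	return (unsigned int)(latency_us * 1000.0 + 0.5); /* us -> ns, rounded to nearest */
}

/* "Resource file" side: wraps the call in the FP region and only ever
 * handles integer values itself. */
unsigned int sketch_caller(unsigned int latency_ns)
{
	unsigned int wm_ns;

	sketch_fp_start();
	wm_ns = sketch_watermark_ns(latency_ns);
	sketch_fp_end();

	return wm_ns;
}
```

In this sketch, sketch_caller(12500) returns 12500 and leaves fp_violations at 0, while calling sketch_watermark_ns() directly increments fp_violations, which is the kind of misuse the real helper is meant to flag with a kernel warning.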