From nobody Mon Feb 9 01:55:09 2026
From: Peter Krempa <pkrempa@redhat.com>
To: devel@lists.libvirt.org
Subject: [PATCH 6/6] docs: Prohibit 'external' links within the webpage
Date: Tue, 8 Oct 2024 15:43:59 +0200
Message-ID: <863f62b3af11ca8f4a5bb0ed313a26658d59f531.1728394993.git.pkrempa@redhat.com>
MIME-Version: 1.0
List-Id: Development discussions about the libvirt library & tools
Content-Type: text/plain; charset="utf-8"

Enforce that relative links are used within the page, so that local
installations don't require an internet connection and/or don't redirect
to the web needlessly.

This is done by flagging any absolute link that starts with the project
URI (barring exceptions) when checking links with
'check-html-references.py'.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/meson.build                 |  3 +++
 scripts/check-html-references.py | 46 +++++++++++++++++++++++++++-----
 2 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/docs/meson.build b/docs/meson.build
index a94f481730..d7343b6665 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -359,6 +359,9 @@ if tests_enabled[0]
     args: [
       check_html_references_prog.full_path(),
       '--require-https',
+      '--project-uri', 'https://libvirt.org',
+      '--project-uri-exceptions', 'docs/manpages/',
+      '--project-uri-exceptions', 'docs/html/',
       '--webroot',
       meson.project_build_root() / 'docs'
     ],
diff --git a/scripts/check-html-references.py b/scripts/check-html-references.py
index 3382d838c5..74299a1958 100755
--- a/scripts/check-html-references.py
+++ b/scripts/check-html-references.py
@@ -53,7 +53,7 @@ def get_file_list(prefix):
 
 
 # loads an XHTML and extracts all anchors, local and remote links for the one file
-def process_file(filename):
+def process_file(filename, project_uri):
     tree = ET.parse(filename)
     root = tree.getroot()
     docname = root.get('data-sourcedoc')
@@ -65,6 +65,7 @@ def process_file(filename):
     anchors = [filename]
     targets = []
     images = []
+    projectlinks = []
 
     for elem in root.findall('.//html:a', ns):
         target = elem.get('href')
@@ -76,6 +77,10 @@ def process_file(filename):
         if target:
             if re.search('://', target):
                 externallinks.append(target)
+
+                if project_uri is not None and target.startswith(project_uri):
+                    projectlinks.append((target, docname))
+
             elif target[0] != '#' and 'mailto:' not in target:
                 targetfull = os.path.normpath(os.path.join(dirname, target))
 
@@ -106,22 +111,24 @@ def process_file(filename):
             imagefull = os.path.normpath(os.path.join(dirname, src))
             images.append((imagefull, docname))
 
-    return (anchors, targets, images)
+    return (anchors, targets, images, projectlinks)
 
 
-def process_all(filelist):
+def process_all(filelist, project_uri):
     anchors = []
     targets = []
     images = []
+    projectlinks = []
 
     for file in filelist:
-        anchor, target, image = process_file(file)
+        anchor, target, image, projectlink = process_file(file, project_uri)
 
         targets = targets + target
         anchors = anchors + anchor
         images = images + image
+        projectlinks = projectlinks + projectlink
 
-    return (targets, anchors, images)
+    return (targets, anchors, images, projectlinks)
 
 
 def check_targets(targets, anchors):
@@ -236,6 +243,26 @@ def check_https(links):
     return fail
 
 
+# checks prohibited external links to local files
+def check_projectlinks(projectlinks, exceptions):
+    fail = False
+
+    for (link, filename) in projectlinks:
+        allowed = False
+
+        if exceptions is not None:
+            for exc in exceptions:
+                if exc in filename:
+                    allowed = True
+                    break
+
+        if not allowed:
+            print(f'ERROR: prohibited external URI \'{link}\' to local project in \'{filename}\'')
+            fail = True
+
+    return fail
+
+
 parser = argparse.ArgumentParser(description='HTML reference checker')
 parser.add_argument('--webroot', required=True,
                     help='path to the web root')
@@ -247,6 +274,10 @@ parser.add_argument('--ignore-images', action='append',
                     help='paths to images that should be considered as used')
 parser.add_argument('--require-https', action="store_true",
                     help='require secure https for external links')
+parser.add_argument('--project-uri',
+                    help='external prefix of the local project (e.g. https://libvirt.org); external links with that prefix are prohibited')
+parser.add_argument('--project-uri-exceptions', action='append',
+                    help='list of path prefixes excluded from the "--project-uri" checks')
 
 args = parser.parse_args()
 
@@ -254,7 +285,7 @@ files, imagefiles = get_file_list(os.path.abspath(args.webroot))
 
 entrypoint = os.path.join(os.path.abspath(args.webroot), args.entrypoint)
 
-targets, anchors, usedimages = process_all(files)
+targets, anchors, usedimages, projectlinks = process_all(files, args.project_uri)
 
 fail = False
 
@@ -283,6 +314,9 @@ else:
     if check_images(usedimages, imagefiles, args.ignore_images):
         fail = True
 
+    if check_projectlinks(projectlinks, args.project_uri_exceptions):
+        fail = True
+
     if args.require_https:
         if check_https(externallinks):
             fail = True
-- 
2.46.0
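
Illustration only, not part of the patch: the check described in the
commit message can be sketched in isolation as below. The helper names
(find_project_links, prohibited_links) and the sample document paths are
hypothetical; only the prefix test on the link and the substring test
against the exception list mirror the logic in the diff.

```python
# Standalone sketch of the project-link check; helper names and sample
# data are assumptions made for illustration, not taken from libvirt.

def find_project_links(links, project_uri):
    """Return (target, docname) pairs whose target is an absolute URI
    starting with project_uri."""
    if project_uri is None:
        return []
    return [(target, doc) for (target, doc) in links
            if '://' in target and target.startswith(project_uri)]


def prohibited_links(projectlinks, exceptions=None):
    """Drop links whose source document matches an exception.

    As in the patch, an exception matches when it occurs anywhere in the
    document name (substring test), not only as a prefix.
    """
    exceptions = exceptions or []
    return [(link, doc) for (link, doc) in projectlinks
            if not any(exc in doc for exc in exceptions)]


# Made-up (href, source document) pairs.
links = [
    ('https://libvirt.org/formatdomain.html', 'docs/kbase/example.rst'),
    ('https://libvirt.org/manpages/virsh.html', 'docs/manpages/virsh.rst'),
    ('https://example.org/other.html', 'docs/kbase/example.rst'),
    ('formatdomain.html', 'docs/kbase/example.rst'),
]

found = find_project_links(links, 'https://libvirt.org')
bad = prohibited_links(found, ['docs/manpages/', 'docs/html/'])

for link, doc in bad:
    print(f"ERROR: prohibited external URI '{link}' to local project in '{doc}'")
```

With this sample data only the first entry is reported: the second is
covered by the 'docs/manpages/' exception, the third points to another
site, and the fourth is already a relative link.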