From: Mauro Carvalho Chehab
To: Jonathan Corbet, Linux Doc Mailing List
Cc: Mauro Carvalho Chehab, Kees Cook, linux-kernel@vger.kernel.org
Subject: [PATCH] docs: kfigure.py: don't crash during read/write
Date: Wed, 20 Aug 2025 11:09:59 +0200

By default, Python handles non-ASCII data in text files poorly: open()
uses the locale's preferred encoding with strict error handling, so it
can reject any character outside that encoding (for example, anything
>= 128 when the encoding is ASCII). Nothing prevents an SVG file from
containing, for instance, a comment with a UTF-8 accented copyright
notice, or even an invalid UTF-8 byte sequence.

While testing PDF and HTML builds, I recently hit a build where
kfigure.py failed with an error saying that a character was > 128,
crashing the PDF output.

To avoid such issues, open the files explicitly as UTF-8 and use the
PEP 383 surrogateescape error handler to prevent read/write errors in
such cases.

Signed-off-by: Mauro Carvalho Chehab
---
 Documentation/sphinx/kfigure.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/sphinx/kfigure.py b/Documentation/sphinx/kfigure.py
index ad495c0da270..8ba07344a1c8 100644
--- a/Documentation/sphinx/kfigure.py
+++ b/Documentation/sphinx/kfigure.py
@@ -88,7 +88,7 @@ def mkdir(folder, mode=0o775):
         os.makedirs(folder, mode)
 
 def file2literal(fname):
-    with open(fname, "r") as src:
+    with open(fname, "r", encoding='utf8', errors='surrogateescape') as src:
         data = src.read()
         node = nodes.literal_block(data, data)
     return node
@@ -355,7 +355,7 @@ def dot2format(app, dot_fname, out_fname):
     cmd = [dot_cmd, '-T%s' % out_format, dot_fname]
     exit_code = 42
 
-    with open(out_fname, "w") as out:
+    with open(out_fname, "w", encoding='utf8', errors='surrogateescape') as out:
         exit_code = subprocess.call(cmd, stdout = out)
         if exit_code != 0:
             logger.warning(
@@ -533,7 +533,7 @@ def visit_kernel_render(self, node):
     literal_block = node[0]
 
     code = literal_block.astext()
-    hashobj = code.encode('utf-8') # str(node.attributes)
+    hashobj = code.encode('utf-8', errors='surrogateescape') # str(node.attributes)
     fname = path.join('%s-%s' % (srclang, sha1(hashobj).hexdigest()))
 
     tmp_fname = path.join(
@@ -541,7 +541,7 @@ def visit_kernel_render(self, node):
 
     if not path.isfile(tmp_fname):
         mkdir(path.dirname(tmp_fname))
-        with open(tmp_fname, "w") as out:
+        with open(tmp_fname, "w", encoding='utf8', errors='surrogateescape') as out:
             out.write(code)
 
     img_node = nodes.image(node.rawsource, **node.attributes)
-- 
2.50.1
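
For illustration only, here is a minimal standalone sketch of the
behaviour the patch relies on. It is not part of kfigure.py: the helper
names roundtrip() and digest(), the file names and the sample data are
all hypothetical. It assumes only the standard library, and shows that
a stray non-UTF-8 byte survives a text-mode read and write when both
sides use errors='surrogateescape', and that the same handler lets the
text be encoded for hashing.

```python
import os
import tempfile
from hashlib import sha1


def roundtrip(raw: bytes) -> bytes:
    """Hypothetical helper: copy raw bytes through text-mode read/write."""
    with tempfile.TemporaryDirectory() as tmpdir:
        src_name = os.path.join(tmpdir, "in.svg")    # illustrative names
        dst_name = os.path.join(tmpdir, "out.svg")
        with open(src_name, "wb") as f:
            f.write(raw)

        # Undecodable bytes become lone surrogates (U+DC80..U+DCFF)
        # instead of raising UnicodeDecodeError.
        with open(src_name, "r", encoding="utf8",
                  errors="surrogateescape") as src:
            text = src.read()

        # On write, the same handler maps those surrogates back to the
        # original bytes.
        with open(dst_name, "w", encoding="utf8",
                  errors="surrogateescape") as out:
            out.write(text)

        with open(dst_name, "rb") as f:
            return f.read()


def digest(text: str) -> str:
    """Hypothetical helper: hash text that may contain escaped bytes."""
    return sha1(text.encode("utf-8", errors="surrogateescape")).hexdigest()


# Valid UTF-8 (accented copyright notice) followed by an invalid byte.
sample = "<!-- © José -->".encode("utf-8") + b"<!-- \xff -->"
copy = roundtrip(sample)
```

Without the error handler, encoding text that contains an escaped byte
with plain 'utf-8' raises UnicodeEncodeError, which is why the
hashobj line needs the same errors= argument as the open() calls.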