From: "fengyunhua"
Cc: "'Bob Feng'", "'Liming Gao'", "'Yuwei Chen'"
References: <20201016074124.831-1-jian.j.wang@intel.com>
In-Reply-To: <20201016074124.831-1-jian.j.wang@intel.com>
Subject: Re: [edk2-devel] [PATCH] BaseTools: fix decoding issue in file operation
Date: Mon, 19 Oct 2020 16:55:15 +0800
Message-ID: <000001d6a5f5$90086c50$b01944f0$@byosoft.com.cn>
Sender: devel@edk2.groups.io
Reply-To: devel@edk2.groups.io,fengyunhua@byosoft.com.cn

Tested-by: Yunhua Feng

-----Original Message-----
From: bounce+27952+66316+5049190+8953120@groups.io <bounce+27952+66316+5049190+8953120@groups.io> On Behalf Of Wang, Jian J
Sent: October 16, 2020 15:41
To: devel@edk2.groups.io
Cc: Bob Feng; Liming Gao; Yuwei Chen
Subject: [edk2-devel] [PATCH] BaseTools: fix decoding issue in file operation

The build tool reports a failure on file read, for example when calling
trim to clean preprocessed source
files, if the tool is running on an OS with a non-Western code page and
the source file contains non-ASCII characters. Even utf-8 has problems
when it encounters certain bytes of cp1252-encoded text (such as 0x92,
0x96, 0xa0, etc.).

Currently, the safest way to read a file in Python code is to use
'latin-1' (iso-8859-1), because it maps every byte in 00-FF and so
cannot cause an encoding/decoding error. It behaves almost the same as
reading the file in binary mode.

cp1252 is similar to latin-1, but it cannot encode the characters
'\x80' to '\x9f' and cannot decode the following bytes:

  '\x81', '\x8d', '\x8f', '\x90', '\x9d'

So if the file contains utf-8/16 encoded characters, decoding it as
cp1252 will sometimes fail.

Refer to the following links for details:

  https://en.wikipedia.org/wiki/Latin-1_Supplement_(Unicode_block)
  https://en.wikipedia.org/wiki/Windows-1252
  https://kb.iu.edu/d/aepu
  https://www.i18nqa.com/debug/table-iso8859-1-vs-windows-1252.html

The following Python code can be used to verify this.

  for i in range(0x100):
      try:
          chr(i).encode('latin-1')
      except UnicodeError:
          print(" %s cannot encode %02x" % ('latin-1', i))

  for i in range(0x100):
      try:
          b = bytes([i])
          b.decode('latin-1')
      except UnicodeError:
          print(" %s cannot decode %02x" % ('latin-1', i))

This patch adds code to enforce 'latin-1' as the encoding argument of
open() in the function OpenLongFilePath() when the file is opened in
text mode. This can solve the file decoding issue completely.
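As a rough illustration of the codec behavior described above (not part of the patch itself), the sketch below decodes one small byte string with utf-8, cp1252 and latin-1, then round-trips it through a text-mode open() using latin-1. The names `raw` and `can_decode` and the temporary file are assumptions made for this example only.

```python
import os
import tempfile

# Illustrative bytes (an assumption for this sketch): 0x92 and 0xa0 are
# valid cp1252 characters, 0x81 is undefined in cp1252.
raw = bytes([0x41, 0x92, 0x81, 0xA0, 0x42])


def can_decode(data, encoding):
    """Return True if `data` decodes under `encoding` without error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


results = {enc: can_decode(raw, enc) for enc in ("utf-8", "cp1252", "latin-1")}
print(results)

# latin-1 maps each byte to the code point with the same value, so a
# text-mode read followed by a text-mode write reproduces the bytes.
# newline="" turns off newline translation so the comparison is exact.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        f.write(raw)
    with open(path, "r", encoding="latin-1", newline="") as f:
        text = f.read()
    with open(path, "w", encoding="latin-1", newline="") as f:
        f.write(text)
    with open(path, "rb") as f:
        round_tripped = f.read()
finally:
    os.remove(path)

print(round_tripped == raw)
```

Note that latin-1 preserves bytes rather than interpreting them: a multi-byte utf-8 sequence read this way comes back as several separate characters, which is harmless as long as the text is only inspected for ASCII content and written back with the same encoding.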
The possible related BZs:

  https://bugzilla.tianocore.org/show_bug.cgi?id=1434
  https://bugzilla.tianocore.org/show_bug.cgi?id=1637
  https://bugzilla.tianocore.org/show_bug.cgi?id=2578
  https://bugzilla.tianocore.org/show_bug.cgi?id=2709
  https://bugzilla.tianocore.org/show_bug.cgi?id=2829

Cc: Bob Feng
Cc: Liming Gao
Cc: Yuwei Chen
Signed-off-by: Jian J Wang
---
 BaseTools/Source/Python/Common/LongFilePathSupport.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/BaseTools/Source/Python/Common/LongFilePathSupport.py b/BaseTools/Source/Python/Common/LongFilePathSupport.py
index 38c4396544..c8dce077f2 100644
--- a/BaseTools/Source/Python/Common/LongFilePathSupport.py
+++ b/BaseTools/Source/Python/Common/LongFilePathSupport.py
@@ -30,7 +30,8 @@ def LongFilePath(FileName):
 # wrap open to support opening a long file path
 #
 def OpenLongFilePath(FileName, Mode='r', Buffer= -1):
-    return open(LongFilePath(FileName), Mode, Buffer)
+    Encoding = None if 'b' in Mode else 'latin-1'
+    return open(LongFilePath(FileName), Mode, Buffer, Encoding)
 
 def CodecOpenLongFilePath(Filename, Mode='rb', Encoding=None, Errors='strict', Buffering=1):
     return codecs.open(LongFilePath(Filename), Mode, Encoding, Errors, Buffering)
-- 
2.24.0.windows.2

-=-=-=-=-=-=
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#66316): https://edk2.groups.io/g/devel/message/66316
Mute This Topic: https://groups.io/mt/77546105/5049190
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [fengyunhua@byosoft.com.cn]
-=-=-=-=-=-=

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#66384): https://edk2.groups.io/g/devel/message/66384
Mute This Topic: https://groups.io/mt/77654194/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-