Date: Fri, 10 May 2019 21:24:01 -0700
Message-Id: <20190511042401.115133-1-joerichey@google.com>
Mime-Version: 1.0
Subject: [edk2-devel] [PATCH] BaseTools: VfrCompile/Pccts: Fix invalid bytes
From: "Joe Richey via Groups.Io"
To: devel@edk2.groups.io
Cc: Bob Feng, Liming Gao, Yonghong Zhu
Reply-To: devel@edk2.groups.io,joerichey@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Three text files contain invalid ASCII bytes. This can mess up tooling that tries to operate on the repository, which will accidentally classify the files as binary data.
https://github.com/josephlr/edk2/tree/format Cc: Bob Feng Cc: Liming Gao Cc: Yonghong Zhu Signed-off-by: Joe Richey --- BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt | 2 +- BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt | 78 ++++++++++----= ------ BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt | 6 +- 3 files changed, 43 insertions(+), 43 deletions(-) diff --git a/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt b/BaseT= ools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt index 539cf775257b..f073e620ab68 100644 --- a/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt +++ b/BaseTools/Source/C/VfrCompile/Pccts/KNOWN_PROBLEMS.txt @@ -40,7 +40,7 @@ An bug (or at least an oddity) is that a reference to LT(1), LA(1), or LATEXT(1) in an action which immediately follows a token match in a rule refers to the token matched, not the token which is in - the lookahead buffer. Consider:=13 + the lookahead buffer. Consider: =20 r : abc <> D <> E; =20 diff --git a/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt b/BaseToo= ls/Source/C/VfrCompile/Pccts/antlr/antlr1.txt index 4a7d22e7f239..140b064217b7 100644 --- a/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt +++ b/BaseTools/Source/C/VfrCompile/Pccts/antlr/antlr1.txt @@ -9,48 +9,48 @@ NAME antlr - ANother Tool for Language Recognition =20 SYNTAX - antlr [_=08o_=08p_=08t_=08i_=08o_=08n_=08s] _=08g_=08r_=08a_=08m_=08m= _=08a_=08r__=08f_=08i_=08l_=08e_=08s + antlr [options] grammar_files =20 DESCRIPTION - _=08A_=08n_=08t_=08l_=08r converts an extended form of context-free g= rammar into + Antlr converts an extended form of context-free grammar into a set of C functions which directly implement an efficient form of deterministic recursive-descent LL(k) parser. Context-free grammars may be augmented with predicates to allow semantics to influence parsing; this allows a form of context-sensitive parsing. Selective backtracking is also available to handle non-LL(k) and even non-LALR(k) con- - structs. 
_=08A_=08n_=08t_=08l_=08r also produces a definition of a l= exer which + structs. Antlr also produces a definition of a lexer which can be automatically converted into C code for a DFA-based - lexer by _=08d_=08l_=08g. Hence, _=08a_=08n_=08t_=08l_=08r serves a = function much like that - of _=08y_=08a_=08c_=08c, however, it is notably more flexible and is = more - integrated with a lexer generator (_=08a_=08n_=08t_=08l_=08r directly= generates - _=08d_=08l_=08g code, whereas _=08y_=08a_=08c_=08c and _=08l_=08e_=08= x are given independent - descriptions). Unlike _=08y_=08a_=08c_=08c which accepts LALR(1) gra= mmars, - _=08a_=08n_=08t_=08l_=08r accepts LL(k) grammars in an extended BNF n= otation - + lexer by dlg. Hence, antlr serves a function much like that + of yacc, however, it is notably more flexible and is more + integrated with a lexer generator (antlr directly generates + dlg code, whereas yacc and lex are given independent + descriptions). Unlike yacc which accepts LALR(1) grammars, + antlr accepts LL(k) grammars in an extended BNF notation - which eliminates the need for precedence rules. =20 - Like _=08y_=08a_=08c_=08c grammars, _=08a_=08n_=08t_=08l_=08r grammar= s can use automatically- + Like yacc grammars, antlr grammars can use automatically- maintained symbol attribute values referenced as dollar - variables. Further, because _=08a_=08n_=08t_=08l_=08r generates top-= down + variables. Further, because antlr generates top-down parsers, arbitrary values may be inherited from parent rules - (passed like function parameters). _=08A_=08n_=08t_=08l_=08r also ha= s a mechan- + (passed like function parameters). Antlr also has a mechan- ism for creating and manipulating abstract-syntax-trees. 
=20 - There are various other niceties in _=08a_=08n_=08t_=08l_=08r, includ= ing the + There are various other niceties in antlr, including the ability to spread one grammar over multiple files or even multiple grammars in a single file, the ability to generate a version of the grammar with actions stripped out (for documentation purposes), and lots more. =20 OPTIONS - -ck _=08n - Use up to _=08n symbols of lookahead when using compressed + -ck n + Use up to n symbols of lookahead when using compressed (linear approximation) lookahead. This type of looka- head is very cheap to compute and is attempted before full LL(k) lookahead, which is of exponential complex- ity in the worst case. In general, the compressed loo- - kahead can be much deeper (e.g, -ck 10) _=08t_=08h_=08a_=08n _= =08t_=08h_=08e _=08f_=08u_=08l_=08l - _=08l_=08o_=08o_=08k_=08a_=08h_=08e_=08a_=08d (_=08w_=08h_=08i_= =08c_=08h _=08u_=08s_=08u_=08a_=08l_=08l_=08y _=08m_=08u_=08s_=08t _=08b_= =08e _=08l_=08e_=08s_=08s _=08t_=08h_=08a_=08n _=084). + kahead can be much deeper (e.g, -ck 10) than the full + lookahead (which usually must be less than 4). =20 -CC Generate C++ output from both ANTLR and DLG. =20 @@ -86,20 +86,20 @@ OPTIONS =20 -ga Generate ANSI-compatible code (default case). This has not been rigorously tested to be ANSI XJ11 C compliant, - but it is close. The normal output of _=08a_=08n_=08t_=08l_=08r= is + but it is close. The normal output of antlr is currently compilable under both K&R, ANSI C, and C++- - this option does nothing because _=08a_=08n_=08t_=08l_=08r gener= ates a + this option does nothing because antlr generates a bunch of #ifdef's to do the right thing depending on the language. =20 - -gc Indicates that _=08a_=08n_=08t_=08l_=08r should generate no C co= de, i.e., + -gc Indicates that antlr should generate no C code, i.e., only perform analysis on the grammar. 
=20 - -gd C code is inserted in each of the _=08a_=08n_=08t_=08l_=08r gene= rated pars- + -gd C code is inserted in each of the antlr generated pars- ing functions to provide for user-defined handling of a detailed parse trace. The inserted code consists of calls to the user-supplied macros or functions called - zzTRACEIN and zzTRACEOUT. The only argument is a _=08c_=08h_=08= a_=08r + zzTRACEIN and zzTRACEOUT. The only argument is a char * pointing to a C-style string which is the grammar rule recognized by the current parsing function. If no definition is given for the trace functions, upon rule @@ -110,17 +110,17 @@ OPTIONS =20 -gh Generate stdpccts.h for non-ANTLR-generated files to include. This file contains all defines needed to - describe the type of parser generated by _=08a_=08n_=08t_=08l_= =08r (e.g. + describe the type of parser generated by antlr (e.g. how much lookahead is used and whether or not trees are constructed) and contains the header action specified by the user. =20 -gk Generate parsers that delay lookahead fetches until - needed. Without this option, _=08a_=08n_=08t_=08l_=08r generate= s parsers - which always have _=08k tokens of lookahead available. + needed. Without this option, antlr generates parsers + which always have k tokens of lookahead available. =20 -gl Generate line info about grammar actions in C parser of - the form # _=08l_=08i_=08n_=08e "_=08f_=08i_=08l_=08e" which mak= es error messages from + the form # line "file" which makes error messages from the C/C++ compiler make more sense as they will point into the grammar file not the resulting C file. Debugging is easier as well, because you will step @@ -128,18 +128,18 @@ OPTIONS =20 -gs Do not generate sets for token expression lists; instead generate a ||-separated sequence of - LA(1)=3D=3D_=08t_=08o_=08k_=08e_=08n__=08n_=08u_=08m_=08b_=08e_= =08r. The default is to generate sets. + LA(1)=3D=3Dtoken_number. The default is to generate sets. 
=20 -gt Generate code for Abstract-Syntax Trees. =20 -gx Do not create the lexical analyzer files (dlg-related). This option should be given when the user wishes to provide a customized lexical analyzer. It may also be - used in _=08m_=08a_=08k_=08e scripts to cause only the parser to= be + used in make scripts to cause only the parser to be rebuilt when a change not affecting the lexical struc- ture is made to the input grammars. =20 - -k _=08n Set k of LL(k) to _=08n; i.e. set tokens of look-ahead + -k n Set k of LL(k) to n; i.e. set tokens of look-ahead (default=3D=3D1). =20 -o dir @@ -171,9 +171,9 @@ OPTIONS release with option -pr on. Context computation is off by default. =20 - -rl _=08n + -rl n Limit the maximum number of tree nodes used by grammar - analysis to _=08n. Occasionally, _=08a_=08n_=08t_=08l_=08r is u= nable to + analysis to n. Occasionally, antlr is unable to analyze a grammar submitted by the user. This rare situation can only occur when the grammar is large and the amount of lookahead is greater than one. A non- @@ -184,14 +184,14 @@ OPTIONS the number of calls to the full LL(k) algorithm. An error message will be displayed, if this limit is reached, which indicates the grammar construct being - analyzed when _=08a_=08n_=08t_=08l_=08r hit a non-linearity. Us= e this - option if _=08a_=08n_=08t_=08l_=08r seems to go out to lunch and= your disk - start thrashing; try _=08n=3D10000 to start. Once the + analyzed when antlr hit a non-linearity. Use this + option if antlr seems to go out to lunch and your disk + start thrashing; try n=3D10000 to start. Once the offending construct has been identified, try to remove - the ambiguity that _=08a_=08n_=08t_=08l_=08r was trying to overc= ome with + the ambiguity that antlr was trying to overcome with large lookahead analysis. The introduction of (...)? 
backtracking blocks eliminates some of these problems - - _=08a_=08n_=08t_=08l_=08r does not analyze alternatives that beg= in with + antlr does not analyze alternatives that begin with (...)? (it simply backtracks, if necessary, at run time). =20 @@ -208,7 +208,7 @@ OPTIONS as the parser file. =20 SPECIAL CONSIDERATIONS - _=08A_=08n_=08t_=08l_=08r works... we think. There is no implicit g= uarantee of + Antlr works... we think. There is no implicit guarantee of anything. We reserve no legal rights to the software known as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS is in the public domain. An individual or company may do @@ -234,7 +234,7 @@ FILES output C++ parser when C++ mode is used. =20 parser.dlg - output _=08d_=08l_=08g lexical analyzer. + output dlg lexical analyzer. =20 err.c token string array, error sets and error support rou- @@ -251,7 +251,7 @@ FILES erated by default. Not used in C++ mode. =20 tokens.h - output #_=08d_=08e_=08f_=08i_=08n_=08e_=08s for tokens used and = function prototypes + output #defines for tokens used and function prototypes for functions generated for rules. =20 =20 diff --git a/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt b/BaseTools/S= ource/C/VfrCompile/Pccts/dlg/dlg1.txt index 06b320de2abb..5ea5e933c808 100644 --- a/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt +++ b/BaseTools/Source/C/VfrCompile/Pccts/dlg/dlg1.txt @@ -9,14 +9,14 @@ NAME dlg - DFA Lexical Analyzer Generator =20 SYNTAX - dlg [_=08o_=08p_=08t_=08i_=08o_=08n_=08s] _=08l_=08e_=08x_=08i_=08c_= =08a_=08l__=08s_=08p_=08e_=08c [_=08o_=08u_=08t_=08p_=08u_=08t__=08f_=08i_= =08l_=08e] + dlg [options] lexical_spec [output_file] =20 DESCRIPTION dlg is a tool that produces fast deterministic finite auto- mata for recognizing regular expressions in input. =20 OPTIONS - -CC Generate C++ output. The _=08o_=08u_=08t_=08p_=08u_=08t__=08f_= =08i_=08l_=08e is not specified + -CC Generate C++ output. The output_file is not specified in this case. 
=20 -C[ level] @@ -69,7 +69,7 @@ OPTIONS in or send output to standard out. =20 SPECIAL CONSIDERATIONS - _=08D_=08l_=08g works... we think. There is no implicit guarantee of + Dlg works... we think. There is no implicit guarantee of anything. We reserve no legal rights to the software known as the Purdue Compiler Construction Tool Set (PCCTS) - PCCTS is in the public domain. An individual or company may do --=20 2.21.0.1020.gf2820cf01a-goog
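
Note on the change: the removed lines in antlr1.txt and dlg1.txt carry "_" followed by a backspace byte (=08 in this quoted-printable body) in front of each underlined character, and KNOWN_PROBLEMS.txt carries a stray =13 byte. The sketch below is not part of the patch and is not necessarily how the files were actually cleaned. It only illustrates, under stated assumptions, how such bytes could be located and how the underscore-backspace pairs could be dropped. The names ALLOWED, find_suspect_bytes and strip_overstrike are hypothetical, and the set of "allowed" bytes is an assumption; real tools may use a different heuristic to decide that a file is binary.

```python
import re

# Assumption: a "clean" text file holds only printable ASCII plus
# tab, line feed and carriage return. Real tools may differ.
ALLOWED = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def find_suspect_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte outside ALLOWED."""
    return [(i, b) for i, b in enumerate(data) if b not in ALLOWED]


def strip_overstrike(data: bytes) -> bytes:
    """Drop each '_' that is immediately followed by a backspace (0x08),
    together with the backspace itself."""
    return re.sub(rb"_\x08", b"", data)


# Sample modelled on the SYNTAX line removed from antlr1.txt.
sample = b"antlr [_\x08o_\x08p_\x08t_\x08i_\x08o_\x08n_\x08s]\n"
cleaned = strip_overstrike(sample)

print(find_suspect_bytes(sample))   # offsets of the backspace bytes
print(cleaned)
print(find_suspect_bytes(cleaned))  # nothing left to report
```

A stray control byte such as the one in KNOWN_PROBLEMS.txt would be reported by find_suspect_bytes but is not touched by strip_overstrike, so it would still need to be removed by hand.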