extern const char *help;
const char *help =
" Show help message.\n"
"\n"
" -1 --single-pass\n"
" Deprecated. Does nothing (single pass is the default now).\n"
"\n"
" -8 --utf-8\n"
" Generate a lexer that reads input in UTF-8 encoding. re2c assumes that character range is 0 -- 0x10FFFF and character size is 1 byte.\n"
"\n"
" -b --bit-vectors\n"
" Optimize conditional jumps using bit masks. Implies -s.\n"
"\n"
" -c --conditions --start-conditions\n"
" Enable support of Flex-like \"conditions\": multiple interrelated lexers within one block. Option --start-conditions is a legacy alias; use --conditions instead.\n"
"\n"
" --case-insensitive\n"
" Treat single-quoted and double-quoted strings as case-insensitive.\n"
"\n"
" --case-inverted\n"
" Invert the meaning of single-quoted and double-quoted strings: treat single-quoted strings as case-sensitive and double-quoted strings as case-insensitive.\n"
"\n"
" -D --emit-dot\n"
" Instead of normal output generate lexer graph in .dot format. The output can be converted to an image with the help of Graphviz (e.g. something like dot -Tpng -odfa.png dfa.dot).\n"
"\n"
" -d --debug-output\n"
" Emit YYDEBUG in the generated code. YYDEBUG should be defined by the user in the form of a void function with two parameters: state (lexer state or -1) and symbol (current input symbol of type YYCTYPE).\n"
"\n"
" --dfa-minimization <moore | table>\n"
" The internal algorithm used by re2c to minimize the DFA: moore (the default) is the Moore algorithm, and table is the \"table filling\" algorithm. Both algorithms should produce the same DFA up to state relabeling; table\n"
" filling is simpler and much slower and serves as a reference implementation.\n"
"\n"
" --dump-adfa\n"
" Debug option: output DFA after tunneling (in .dot format).\n"
"\n"
" --dump-cfg\n"
" Debug option: output control flow graph of tag variables (in .dot format).\n"
"\n"
" --dump-closure-stats\n"
" Debug option: output statistics on the number of states in closure.\n"
"\n"
" --dump-dfa-det\n"
" Debug option: output DFA immediately after determinization (in .dot format).\n"
"\n"
" --dump-dfa-min\n"
" Debug option: output DFA after minimization (in .dot format).\n"
"\n"
" --dump-dfa-tagopt\n"
" Debug option: output DFA after tag optimizations (in .dot format).\n"
"\n"
" --dump-dfa-raw\n"
" Debug option: output DFA under construction with expanded state-sets (in .dot format).\n"
"\n"
" --dump-interf\n"
" Debug option: output interference table produced by liveness analysis of tag variables.\n"
"\n"
" --dump-nfa\n"
" Debug option: output NFA (in .dot format).\n"
"\n"
" -e --ecb\n"
" Generate a lexer that reads input in EBCDIC encoding. re2c assumes that character range is 0 -- 0xFF and character size is 1 byte.\n"
"\n"
" --eager-skip\n"
" Make the generated lexer advance the input position \"eagerly\": immediately after reading input symbol. By default this happens after transition to the next state. Implied by --no-lookahead.\n"
"\n"
" --empty-class <match-empty | match-none | error>\n"
" Define the way re2c treats empty character classes. With match-empty (the default) empty class matches empty input (which is illogical, but backwards-compatible). With match-none empty class always fails to match.\n"
" With error empty class raises a compilation error.\n"
"\n"
" --encoding-policy <fail | substitute | ignore>\n"
" Define the way re2c treats Unicode surrogates. With fail re2c aborts with an error when a surrogate is encountered. With substitute re2c silently replaces surrogates with the error code point 0xFFFD. With ignore (the\n"
" default) re2c treats surrogates as normal code points. The Unicode standard says that standalone surrogates are invalid, but real-world libraries and programs behave in different ways.\n"
"\n"
" -f --storable-state\n"
" Generate a lexer which can store its inner state. This is useful in push-model lexers which are stopped by an outer program when there is not enough input, and then resumed when more input becomes available. In this\n"
" mode users should additionally define YYGETSTATE() and YYSETSTATE(state) macros and variables yych, yyaccept and state as part of the lexer state.\n"
"\n"
" -F --flex-syntax\n"
" Partial support for Flex syntax: in this mode named definitions don't need the equal sign and the terminating semicolon, and when used they must be surrounded by curly braces. Names without curly braces are treated as\n"
" double-quoted strings.\n"
"\n"
" -g --computed-gotos\n"
" Optimize conditional jumps using non-standard \"computed goto\" extension (which must be supported by the C/C++ compiler). re2c generates jump tables only in complex cases with a lot of conditional branches. Complexity\n"
" threshold can be configured with cgoto:threshold configuration. This option implies -b.\n"
"\n"
" -I PATH\n"
" Add PATH to the list of locations which are used when searching for include files. This option is useful in combination with /*!include:re2c ... */ directive. Re2c looks for FILE in the directory of including file and\n"
" in the list of include paths specified by -I option.\n"
"\n"
" -i --no-debug-info\n"
" Do not output #line information. This is useful when the generated code is tracked by some version control system or IDE.\n"
"\n"
" --input <default | custom>\n"
" Specify re2c input API. Option default is the default API composed of pointer-like primitives YYCURSOR, YYMARKER, YYLIMIT etc. Option custom is the generic API composed of function-like primitives YYPEEK(), YYSKIP(),\n"
" YYBACKUP(), YYRESTORE() etc.\n"
"\n"
" --input-encoding <ascii | utf8>\n"
" Specify the way re2c parses regular expressions. With ascii (the default) re2c handles input as ASCII-encoded: any sequence of code units is a sequence of standalone 1-byte characters. With utf8 re2c handles input as\n"
" UTF8-encoded and recognizes multibyte characters.\n"
"\n"
" --location-format <gnu | msvc>\n"
" Specify location format in messages. With gnu locations are printed as 'filename:line:column: ...'. With msvc locations are printed as 'filename(line,column) ...'. Default is gnu.\n"
"\n"
" --no-generation-date\n"
" Suppress date output in the generated file.\n"
"\n"
" --no-lookahead\n"
" Use TDFA(0) instead of TDFA(1). This option has effect only with --tags or --posix-captures options.\n"
"\n"
" --no-optimize-tags\n"
" Suppress optimization of tag variables (useful for debugging).\n"
"\n"
" --no-version\n"
" Suppress version output in the generated file.\n"
"\n"
" -o OUTPUT --output=OUTPUT\n"
" Specify the OUTPUT file.\n"
"\n"
" -P --posix-captures\n"
" Enable submatch extraction with POSIX-style capturing groups.\n"
"\n"
" --posix-closure <gor1 | gtop>\n"
" Specify shortest-path algorithm used for construction of epsilon-closure with POSIX disambiguation semantics: gor1 (the default) stands for Goldberg-Radzik algorithm, and gtop stands for \"global topological order\"\n"
" algorithm.\n"
"\n"
" -r --reusable\n"
" Allows reuse of re2c rules with /*!rules:re2c */ and /*!use:re2c */ blocks. Exactly one rules-block must be present. The rules are saved and used by every use-block that follows, which may add its own rules and\n"
" configurations.\n"
"\n"
" -S --skeleton\n"
" Ignore user-defined interface code and generate a self-contained \"skeleton\" program. Additionally, generate input files with strings derived from the regular grammar and compressed match results that are used to verify\n"
" \"skeleton\" behavior on all inputs. This option is useful for finding bugs in optimizations and code generation.\n"
"\n"
" -s --nested-ifs\n"
" Use nested if statements instead of switch statements in conditional jumps. This usually results in more efficient code with non-optimizing C/C++ compilers.\n"
"\n"
" -T --tags\n"
" Enable submatch extraction with tags.\n"
"\n"
" -t HEADER --type-header=HEADER\n"
" Generate a HEADER file that contains enum with condition names. Requires -c option.\n"
"\n"
" -u --unicode\n"
" Generate a lexer that reads UTF32-encoded input. Re2c assumes that character range is 0 -- 0x10FFFF and character size is 4 bytes. This option implies -s.\n"
"\n"
" -V --vernum\n"
" Show version information in MMmmpp format (major, minor, patch).\n"
"\n"
" --verbose\n"
" Output a short message in case of success.\n"
"\n"
" -v --version\n"
" Show version information.\n"
"\n"
" -w --wide-chars\n"
" Generate a lexer that reads UCS2-encoded input. Re2c assumes that character range is 0 -- 0xFFFF and character size is 2 bytes. This option implies -s.\n"
"\n"
" -x --utf-16\n"
" Generate a lexer that reads UTF16-encoded input. Re2c assumes that character range is 0 -- 0x10FFFF and character size is 2 bytes. This option implies -s.\n"
"\n"
" -W Turn on all warnings.\n"
"\n"
" -Werror\n"
" Turn warnings into errors. Note that this option alone doesn't turn on any warnings; it only affects those warnings that have been turned on so far or will be turned on later.\n"
"\n"
" -W<warning>\n"
" Turn on warning.\n"
"\n"
" -Wno-<warning>\n"
" Turn off warning.\n"
"\n"
" -Werror-<warning>\n"
" Turn on warning and treat it as an error (this implies -W<warning>).\n"
"\n"
" -Wno-error-<warning>\n"
" Don't treat this particular warning as an error. This doesn't turn off the warning itself.\n"
"\n"
" -Wcondition-order\n"
" Warn if the generated program makes implicit assumptions about condition numbering. One should use either the -t, --type-header option or the /*!types:re2c*/ directive to generate a mapping of condition names to numbers\n"
" and then use the autogenerated condition names.\n"
"\n"
" -Wempty-character-class\n"
" Warn if a regular expression contains an empty character class. Trying to match an empty character class makes no sense: it should always fail. However, for backwards compatibility reasons re2c allows empty character\n"
" classes and treats them as empty strings. Use the --empty-class option to change the default behavior.\n"
"\n"
" -Wmatch-empty-string\n"
" Warn if a rule is nullable (matches an empty string). If the lexer runs in a loop and the empty match is unintentional, the lexer may unexpectedly hang in an infinite loop.\n"
"\n"
" -Wswapped-range\n"
" Warn if the lower bound of a range is greater than its upper bound. The default behavior is to silently swap the range bounds.\n"
"\n"
" -Wundefined-control-flow\n"
" Warn if some input strings cause undefined control flow in the lexer (the faulty patterns are reported). This is the most dangerous and most common mistake. It can be easily fixed by adding the default rule * which has\n"
" the lowest priority, matches any code unit, and consumes exactly one code unit.\n"
"\n"
" -Wunreachable-rules\n"
" Warn about rules that are shadowed by other rules and will never match.\n"
"\n"
" -Wuseless-escape\n"
" Warn if a symbol is escaped when it shouldn't be. By default, re2c silently ignores such escapes, but this may as well indicate a typo or an error in the escape sequence.\n"
"\n"
" -Wnondeterministic-tags\n"
;