yex.parse.Parser
yex.parse.Parser(source, bounded=Bounding.NO, level=RunLevel.EXECUTING, on_eof=OnEof.NONE, no_outer=False)
Interprets a TeΧ file, and expands its macros.
Takes a source, and iterates over it, returning the tokens with the macros expanded according to the definitions stored in the Document attached to that source.
By default, once the source is exhausted, Parser will keep returning None forever, which is what you want if you're planning to do lookahead. If you're going to put this Parser into a for loop, you'll want to set on_eof=OnEof.EXHAUST.
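The difference between the two end-of-file behaviours can be illustrated with a small model. This is only a sketch: `ToyParser` and the two-member `OnEof` enum below are hypothetical stand-ins written for this example, not the yex classes, and they model nothing except what happens once the input runs out.

```python
from enum import Enum, auto


class OnEof(Enum):
    """Stand-in for yex.parse.parser.OnEof; only two members are modelled."""
    NONE = auto()     # keep returning None once the source runs out
    EXHAUST = auto()  # stop iteration once the source runs out


class ToyParser:
    """Hypothetical miniature of the end-of-file behaviour; not yex code."""

    def __init__(self, items, on_eof=OnEof.NONE):
        self._items = list(items)
        self.on_eof = on_eof

    def __iter__(self):
        return self

    def __next__(self):
        if self._items:
            return self._items.pop(0)
        if self.on_eof is OnEof.EXHAUST:
            raise StopIteration
        return None


# With EXHAUST, a for loop (or list()) terminates on its own.
looped = list(ToyParser("ab", on_eof=OnEof.EXHAUST))

# With NONE, the caller sees None after the last item and must stop itself.
lookahead = ToyParser("ab")
seen = [next(lookahead) for _ in range(4)]
```

In the model, a `for` loop over a parser left at `OnEof.NONE` would never finish, which is why the loop case needs `OnEof.EXHAUST`.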
It's fine to attach another Parser to the same source, and to run it even when this one is active.
Attributes:
| Name | Type | Description |
|---|---|---|
| source | typing.Union[yex.parse.Tokeniser, typing.TextIO, typing.List, str] | the source |
| doc | yex.Document | the document we're helping create. |
| bounded | yex.parse.parser.Bounding | how far to run an Expander before we stop. If this is "balanced" or "single", it requires |
| level | yex.parse.parser.RunLevel | the level to run at; see the documentation for RunLevel for further information. Default is RunLevel.EXECUTING. |
| on_eof | yex.parse.parser.OnEof | what to do if we reach the end of the file. |
| no_outer | bool | if True, attempting to call a macro which was defined as "outer" will cause an error. Defaults to False. |
| location | typing.Union[yex.parse.Location, None] | the current position of this expander, or None if we're not tracking a position. |
| delegate | typing.Union[yex.parse.Expander, None] | if this is not |
| running | bool | True if we're still running; False if we've reached the end of the part we're looking at. |
| is_expanding | bool | whether this Expander is currently expanding tokens. If the runlevel is below EXPANDING, we are never expanding. If it's EXPANDING or higher, then we are expanding iff we are not forbidden to expand by a conditional. For example, even if level was EXPANDING, we wouldn't be expanding straight after |
Source code in yex/parse/parser.py
SPIN_LIMIT = 1000 (class-attribute, instance-attribute)
Maximum number of times we can allow a parser to return
None before we give up on it.
another(subclass=None, preserve_step_bounding=False, **kwargs)
Returns a parser like this one, with given changes to its behaviour.
The result will be a parser on the same Tokeniser. If there are no changes requested, or if the changes requested make no difference, the result will be this same Parser; otherwise it will be a new Parser.
Any setting specified in kwargs will be honoured, with the exception of bounded -- see below about that. All other settings will be copied from this Parser.
How bounded works:
- If bounded is specified in kwargs, the new parser will have the specified value.
- Otherwise, if preserve_step_bounding is True, and self.bounded=="step", the new parser will also have bounded="step".
- Otherwise, the new parser will always have bounded="no".
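The three rules above can be written out as a small decision function. This is a sketch under assumptions: `resolve_bounded` is a hypothetical helper invented for illustration, and plain strings stand in for the library's Bounding values, following the `"step"` and `"no"` spellings used in the rules.

```python
def resolve_bounded(current, kwargs, preserve_step_bounding=False):
    """Pick the bounding for a derived parser, following the three rules.

    `current` is the existing parser's bounding; `kwargs` are the requested
    changes. Hypothetical helper; not part of yex.
    """
    if "bounded" in kwargs:
        # Rule 1: an explicit request always wins.
        return kwargs["bounded"]
    if preserve_step_bounding and current == "step":
        # Rule 2: step bounding is carried over only when asked for.
        return "step"
    # Rule 3: everything else falls back to unbounded.
    return "no"


explicit = resolve_bounded("step", {"bounded": "balanced"})
kept = resolve_bounded("step", {}, preserve_step_bounding=True)
dropped = resolve_bounded("step", {})
```

Note that, unlike the other settings, the current bounding is never simply copied: without an explicit request or `preserve_step_bounding`, the derived parser is unbounded.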
Consider: This might be better suited to a factory method, "from_another", to produce an instance of the class it's called on.
eat_optional_spaces(level=RunLevel.DEEP)
Eats zero or more space tokens.
This is like Tokeniser.eat_optional_spaces(), except that it can also execute controls and active characters, then continue to consider the result.
Returns a list of the Tokens consumed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | yex.parse.parser.RunLevel | the runlevel to run at. | yex.parse.parser.RunLevel.DEEP |
end()
Marks this Parser as finished.
get_digit_sequence(accept_ch, accept_decimal_point)
Reads and returns a series of symbols.
The result is taken from the next zero or more items. They are accepted if:
- they are LETTER or OTHER tokens, and their "ch" property is in accept_ch; or
- they are single-character strings, and they are in accept_ch.
This exists because if we read in the indexes of arrays using any other method, we risk \catcodeNN= affecting the way the symbol after the value which is assigned to \catcodeNN is read.
See test_tokeniser_whitespace_after_control_words().
Tokens are represented in the result by their ch property.
Strings are used directly.
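The acceptance rule can be sketched as follows. The `Token` dataclass and `take_accepted` function are hypothetical stand-ins written for this example, with category names as plain strings; the accept_decimal_point flag is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical stand-in for a yex token: a character plus a category name."""
    ch: str
    category: str  # e.g. "LETTER", "OTHER", "SPACE"


def take_accepted(items, accept_ch):
    """Collect the leading run of acceptable items, as their characters.

    Returns (accepted_characters, remaining_items). Stops at the first
    item that fails the rule, leaving it unconsumed.
    """
    accepted = []
    for index, item in enumerate(items):
        if isinstance(item, Token):
            ok = item.category in ("LETTER", "OTHER") and item.ch in accept_ch
            ch = item.ch  # tokens are represented by their ch property
        else:
            ok = isinstance(item, str) and len(item) == 1 and item in accept_ch
            ch = item     # strings are used directly
        if not ok:
            return accepted, items[index:]
        accepted.append(ch)
    return accepted, []


items = [Token("1", "OTHER"), "2", Token("3", "OTHER"), Token(" ", "SPACE"), "4"]
digits, rest = take_accepted(items, "0123456789")
```

Here the run stops at the space token, so the trailing `"4"` is left for whatever reads next.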
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| accept_ch | str | the characters we can accept | required |
| accept_decimal_point | bool | if | required |
Returns:
next(**kwargs)
Returns the next item.
This is just like next() on an iterator, but with more options. (And indeed, our iterators are implemented in terms of this method.)
Args are as for another().
Raises:
| Type | Description |
|---|---|
| UnexpectedEOFError | on unexpected end of file, or if |
peek()
Returns the item which is next due to be returned by next().
If this would go past the end of the file, we return None,
whatever the setting of on_eof.
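One common way to get this behaviour is to read an item and immediately push it back. The sketch below assumes that approach; `ToySource` is a hypothetical class written for this example, and nothing here is claimed about how yex implements peek() internally.

```python
class ToySource:
    """Hypothetical token source with a pushback stack; not yex code."""

    def __init__(self, items):
        self._items = list(items)
        self._pushback = []

    def next(self):
        """Return the next item, or None at the end of the input."""
        if self._pushback:
            return self._pushback.pop()
        if self._items:
            return self._items.pop(0)
        return None

    def push(self, thing):
        """Put an item back so that it is returned by the following next()."""
        if thing is not None:
            self._pushback.append(thing)

    def peek(self):
        """Look at the upcoming item without consuming it."""
        item = self.next()
        self.push(item)
        return item


src = ToySource("xy")
peeked_twice = (src.peek(), src.peek())
consumed = [src.next(), src.next()]
at_end = src.peek()
```

Peeking any number of times leaves the stream unchanged, and at the end of the input the model returns None.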
push(thing, clean_char_tokens=False, is_result=False)
Pushes back a token, a character, or anything else.
This is mostly just a wrapper for the push method in Tokeniser. But we do check for "beginning group" and "ending group" tokens, and adjust our fields accordingly.
All Parsers share pushback, and in general it's fine to push things through a parser when you received them from a different Parser. The only exception to this is when you're using balanced expansion: because we have to keep a count of balanced braces, you should remember to push Tokens back through the Parser that gave them to you.
If you push bare characters, they will be converted by the source as it thinks appropriate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| thing | yex.parse.tokeniser.Any | whatever you're pushing back. Pushing None will be ignored. If this is a string, or a list specifically, it will be split into its members and pushed in reverse order. For example, pushing 'cat' is the same as pushing 't', then pushing 'a', then pushing 'c'. | required |
| clean_char_tokens | bool | if True, all bare characters will be converted to the Tokens for those characters. (For example, 'T', 'e', 'X' -> ('T' 12) ('e' 12) ('X' 12).) The rules about how this is done are on p213 of the TeΧbook. If False, the characters will remain bare characters and the source will tokenise them as usual when it gets to them. | False |
| is_result | bool | If you're a control, and your job involves reading some data, then pushing a result, set this to True when you push the result. This will allow \expandafter to work correctly. If you're implemented through a decorator, and your result is pushed via returning it, you don't have to worry: the decorator will set is_result=True when it pushes your return values. | False |
Raises:
| Type | Description |
|---|---|
| EOFError | if this parser is exhausted. |
| GoneBeforeTheBeginningError | if we're bounded, and you push more BEGINNING_GROUP tokens than you've already received. |
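Two of the rules above, reverse-order splitting and the brace count behind GoneBeforeTheBeginningError, can be modelled in a few lines. This is a simplified sketch: `ToyPushback` and `GoneBeforeTheBeginning` are hypothetical names, the literal characters `{` and `}` stand in for beginning-group and ending-group tokens, and the model always behaves as if it were bounded.

```python
class GoneBeforeTheBeginning(Exception):
    """Stand-in for the library's GoneBeforeTheBeginningError."""


class ToyPushback:
    """Hypothetical pushback stack with brace counting; not yex code."""

    def __init__(self):
        self.stack = []  # the last item pushed is the first one read back
        self.depth = 0   # "{" handed out so far, minus "}" handed out

    def receive(self, item):
        """Record an item that has been handed out to the caller."""
        if item == "{":
            self.depth += 1
        elif item == "}":
            self.depth -= 1
        return item

    def push(self, thing):
        """Push back a single item, or each member of a string or list."""
        if thing is None:
            return  # pushing None is ignored
        if isinstance(thing, (str, list)):
            # Members go on in reverse, so they come back in reading order.
            for member in reversed(thing):
                self._push_one(member)
        else:
            self._push_one(thing)

    def _push_one(self, item):
        if item == "{":
            if self.depth == 0:
                raise GoneBeforeTheBeginning("more '{' pushed than received")
            self.depth -= 1
        elif item == "}":
            self.depth += 1
        self.stack.append(item)


pb = ToyPushback()
pb.push("cat")
pushed_order = list(pb.stack)
read_back = [pb.stack.pop() for _ in range(3)]
```

Because the count lives on the object that handed out the braces, pushing a `{` through a different object would see a depth of zero and fail, which mirrors the advice about balanced expansion.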